From: Joscha Bach
Sent: Wednesday, March 1, 2017 7:29 PM
To: Jeffrey Epstein
Cc: Barnaby Marsh
Subject: Re:
Thank you, Jeffrey! This is from Noam, right? I would be very interested in reading the responses of linguists and
computational language modelers to this.
May I forward it to a friend at Google X?
Some notes:
> basic assumptions about human language that should I think be uncontroversial, extensively discussed elsewhere, then
turning to a sample of challenges. A person's language is an internal system, part of human biology, based on
configurations of the brain, developing in each person through interaction of specific biological endowment (the topic of
UG - universal grammar in contemporary terminology), external environment, and general properties and principles of
growth and development that are independent of language.
As far as I understand, there is not yet an agreement among linguists wrt. UG, i.e. how much is innate vs. do humans just
converge on the simplest type 3 grammar that is consistent with the constraints they observe in their local environment.
I think Noam argues that we have very specific circuitry for language, whereas the other camp would suggest that we
are general learners, with specific rewards that bias us towards compositionality and systematicity. OTOH, this might
also be read as a variant of Noam's "Strong Minimalist Thesis" (SMT).
The controversy will be eventually resolved by progress in building systems that learn natural language.
> The acquired system is an "internal language" (I-language), a computational system that yields an infinite array of
hierarchically structured expressions that are interpreted at the conceptual-intentional CI interface as, in effect, a
"language of thought" (LOT), and that can be externalized to one or another sensorymotor system, typically sound. Also
relevant are some considerations about evolution of language.
> Little is known about the evolution of cognitive faculties, a matter discussed in an important article by Richard
Lewontin, whose own view for the prospects was dim
Most folks in cognitive science would probably agree that most cortical activity is devoted to building a generative
simulation of the outside world by a process of hierarchical learning. These simulations can be mapped on a conceptual
manifold, something like an address space of our sensory motor representations of the world, which we can use to
evoke and shape our mental simulations. Language is our interface to that conceptual space, and external language
allows us to synchronize concepts even in the absence of matching sensory motor representations, i.e. we can build
mental simulations of things that we never experienced, by interpolating between concepts that address mental
simulations we know.
It seems that Noam's approach is unique in that he focuses entirely on language and concepts, while treating the
understanding of the underlying cognitive faculties as hopeless, while many others would argue that understanding
language without first understanding pre-linguistic mental representations might be impossible.
That said, Noam's characterization of I-language and LOT at the "conceptual-intentional" interface, with an
externalization through generative mechanisms, is probably a useful basis, regardless of where individual researchers
come from.
> [i] Anatomically modern humans (AMH) appear about 200 thousand years ago.
EFTA_R1_01904794
EFTA02658235
> [ii] The faculty of language FL appears to be a true species property: shared among human groups (with limited
individual differences) and in all essential respects, unique to humans. In particular, there is no meaningful evidence for
existence of language prior to AMH.
> [iii] Recent genomic studies indicate that some human groups (San people) separated from other humans about
150kya. As far as we know, they share FL with other human groups.
> [iv] The San languages are all and only those that have the curious property of phonetic clicks, and there may be some
articulatory adaptation to producing them (See Huijbregts, forthcoming).
Nguni languages have clicks, too, but they seem to have imported them from Khoisan.
> [v] The first (very limited) indication of some form of symbolic behavior appears at about 75kya. Not long after that,
we have rich evidence of quite extraordinary creative achievements (Lascaux, etc.).
This is consistent with another observation: Modern humans had a population bottleneck of 2000-3000 individuals ca.
75000 years ago, which coincides with the Toba eruption. This does not necessarily mean that the volcano killed off
almost all hominids, but it increased the evolutionary pressure, and it is possible that our ancestors evolved a mutation
that enabled them to outcompete and kill most of the hominid competition (including Neanderthals). What if that
mutation is something that roughly translates into "symbolic behavior"?
I currently think that much of our civilization might be the result of a series of quite specific mutations. Our ancestors
went from 3000 individuals to one million and remained there until they developed religions. Religion and other
ideologies are based on a need for conformance to internalized norms, i.e. an innate desire to serve as part of a system
that is larger than the individual's reputation based group. They were also based on a shared conceptual space.
Challenge 1 seems mostly to amount to: verify that 1. all human groups have language, and 2. there is no grammatical
non-human language. One of the interesting questions might be if dolphins have grammatical language, another one
concerns the limits of learning in non-human primates. The challenge is completely empirical.
Challenge 2 seems very exciting to me; I read it as: does language have intrinsically linear order, or is that only imposed by the
sequentialization of articulation? Grammatical language has a tree structure, and the tree seems to be created
probabilistically in the listener, from a string of discrete symbols. Would natural language be learnable without the
constraints of sequentiality and discreteness?
Challenge 3: do we need externalization to learn and process language? I would suspect that an individual can play a
language game against itself until it converges on its own language, but it is not clear that humans are among the class
of individuals that can do that from scratch. Most research suggests that there is a critical window in which we must
pick up our first language for perfect fluency, and there seems to be no evidence of entirely individualistic
acquisition/formation of a first language. If that is true, is that a constraint of the way language learning is
implemented in the human brain, or a complexity constraint within language itself?
It seems to be clear that learning a programming language changes the way we think, i.e. it provides evidence for a weak
version of the Sapir-Whorf hypothesis. But that is not so much a constraint of externalization, but of the semantic
structures addressed by the language.
I imagine that pure work in a computer science lab can make some interesting progress on challenges 2 and 3.
Challenge 4: I don't understand enough about the context to see the significance yet; I would think that once we have
an SMT model of language formation, we can learn additional operations that perform operations on the generated
mental representation, based on arbitrary signals. This may require us to leave an approach that attempts to sandbox
language from general cognition, but why would we want to constrain SMT based models by such a sandbox?
Challenge 5: Again, I don't understand enough of the context to understand why probabilistic interpretation cannot fill in
the gaps. A probabilistic model will weight alternatives, and the binary Merge is the simplest, preferred case?
Challenge 6: The question of the structure of individual lexical items might require a perspective that integrates mental
representations beyond language/SMT.
Challenge 7: Do semantic atoms refer to the external world ("referential doctrine")? - This seems to be quite clearly
false; they refer to representations in the neocortex that are mutable and acquired through learning (structure or
reinforcement) and inference.
Challenge 8: Noam seems to agree with my take on 7. How are semantic items acquired? - This challenge comes down
to the general problems of learning and perception, i.e. pretty much everything in cognitive science outside of language!
Challenge 8 seems to be designed by a rocket scientist who specializes in combustion chambers and leaves all other
parts of getting the rocket to fly as an exercise to their grad student...
Challenge 9: Noam suggests that meaning must be derived from innate information, and wants to study universals
between language to identify the innate bits. However, it is not clear if they do not stem from the properties of
mathematics, i.e. there is a limited space of "useful simple axiomatic systems" that can be individually explored by
learning systems. Kant attempted to describe this space, identified it as apriori and synthetic, and listed the basic
structural categories that we would use to characterize the world. Sowa and a few others have made contributions to
basic ontologies, and perhaps it is time to revisit Kant's project?
Challenge 10: Do music, planning, arithmetic stem from language, or do all result from a shared innovation of modern
hominid brains? - Obviously, different answers in that space might be possible, for instance music could be a parasitic
byproduct of rewards for discovering compositional representations that our brain needs to make us interested in
learning grammar, while basic planning is independent, and complex planning needs language for structuring and
operating on the conceptual space. This makes the question extremely general.
It also gives rise to the more general question of what exactly makes homo sapiens different from the other
chimpanzees. I suspect that our brains are trained layer by layer, whereby each layer has a time of high plasticity during
its primary training phase, then undergoes synaptic pruning, and has low plasticity later on. The duration of the training
phases is regulated by genetic switches. Increasing the duration will extend infancy and childhood (i.e. increase the cost
of upbringing), but give each layer more training data. Perhaps humans outperform other apes because they get a
magnitude more training data before their brains lose infant plasticity, which results in dramatically better ability to
generalize and abstract?
Challenge 11: Rare constructions can be understood by children, and thus there should be a mechanism to derive them
from more simple rules, despite apparent evidence to the contrary, which should be explained [away].
Challenge 12: Noam suggests that the complexity of most constructions in the face of "poverty of stimuli" means that
I-languages are 1. very similar, 2. differences result from externalization, 3. should therefore stem from UG. He wants this
shown, or an alternative.
An alternative explanation might be that the space of possible human grammars is small enough to allow rapid
convergence, and in polyglots even allow for a complete mapping. That would not be a property of an
evolutionary-engineering UG, but an apriori of the mathematics of human grammars.
Challenge 13: What small change in a brain could lead to the unique cognitive abilities of homo sapiens, including
language? - There are a lot of different hypotheses of this, among them what I suggest in (10), and differential
attention/reward for learning compositional structures, or several successive modifications in the reward system. I
think that Noam suspects that the culprit is a new connective pathway, perhaps somewhat similar to Julian Jaynes's
Bicameral Mind hypothesis?
These challenges are extremely inspiring food for thought!
Bests,
Joscha
> On Mar 1, 2017, at 7:01 AM, jeffrey E. <[email protected]> wrote:
> <Challenges Language 2.17.docx>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>conversation-id</key>
<integer>52117</integer>
<key>date-last-viewed</key>
<integer>0</integer>
<key>date-received</key>
<integer>1488396562</integer>
<key>flags</key>
<integer>8590195717</integer>
<key>gmail-label-ids</key>
<array>
<integer>6</integer>
<integer>2</integer>
</array>
<key>remote-id</key>
<string>692309</string>
</dict>
</plist>
ℹ️ Document Details
SHA-256
dccea4e9d94bb8508d29224ee8b2eaa3cfb053079ee6296609213cb6c118686b
Bates Number
EFTA02658235
Dataset
DataSet-11
Document Type
document
Pages
4