From: jeffrey E. <[email protected]>
Sent: Wednesday, March 1, 2017 8:09 PM
To: Joscha Bach
Subject: Re:
yes, confidential
On Wed, Mar 1, 2017 at 2:29 PM, Joscha Bach < > wrote:
Thank you, Jeffrey! This is from Noam, right? I would be very interested in reading the responses of linguists and
computational language modelers to this.
May I forward it to a friend at Google X?
Some notes:
> basic assumptions about human language that should I think be uncontroversial, extensively discussed
elsewhere, then turning to a sample of challenges. A person's language is an internal system, part of human biology,
based on configurations of the brain, developing in each person through interaction of specific biological endowment
(the topic of UG, universal grammar in contemporary terminology), external environment, and general properties
and principles of growth and development that are independent of language.
As far as I understand, there is not yet an agreement among linguists wrt. UG, i.e. how much is innate, versus whether
humans just converge on the simplest type of grammar that is consistent with the constraints they observe in their local
environment. I think Noam argues that we have very specific circuitry for language, whereas the other camp would
suggest that we are general learners, with specific rewards that bias us towards compositionality and systematicity.
OTOH, this might also be read as a variant of Noam's "Strong Minimalist Thesis" (SMT).
The controversy will eventually be resolved by progress in building systems that learn natural language.
> The acquired system is an "internal language" (I-language), a computational system that yields an infinite array
of hierarchically structured expressions that are interpreted at the conceptual-intentional CI interface as, in effect, a
"language of thought" (LOT), and that can be externalized to one or another sensorymotor system, typically sound.
Also relevant are some considerations about evolution of language.
> Little is known about the evolution of cognitive faculties, a matter discussed in an important article by Richard
Lewontin, whose own view for the prospects was dim
Most folks in cognitive science would probably agree that most cortical activity is devoted to building a
generative simulation of the outside world by a process of hierarchical learning. These simulations can be mapped on a
conceptual manifold, something like an address space of our sensory motor representations of the world, which we can
use to evoke and shape our mental simulations. Language is our interface to that conceptual space, and external
language allows us to synchronize concepts even in the absence of matching sensory motor representations, i.e. we can
build mental simulations of things that we never experienced, by interpolating between concepts that address mental
simulations we know.
It seems that Noam's approach is unique in that he focuses entirely on language and concepts and treats
the understanding of the underlying cognitive faculties as hopeless, whereas many others would argue that understanding
language without first understanding pre-linguistic mental representations might be impossible.
EFTA_R1_01905083
EFTA02658453
That said, Noam's characterization of I-language and LOT at the "conceptual-intentional" interface, with an
externalization through generative mechanisms, is probably a useful basis, regardless of where individual researchers
come from.
> [i] Anatomically modern humans (AMH) appear about 200 thousand years ago.
> [ii] The faculty of language FL appears to be a true species property: shared among human groups (with
limited individual differences) and in all essential respects, unique to humans. In particular, there is no meaningful
evidence for existence of language prior to AMH.
> [iii] Recent genomic studies indicate that some human groups (San people) separated from other humans
about 150kya. As far as we know, they share FL with other human groups.
> [iv] The San languages are all and only those that have the curious property of phonetic clicks, and there may
be some articulatory adaptation to producing them (See Huijbregts, forthcoming).
Nguni languages have clicks, too, but they seem to have imported them from Khoisan.
> [v] The first (very limited) indication of some form of symbolic behavior appears at about 75kya. Not long
after that, we have rich evidence of quite extraordinary creative achievements (Lascaux, etc.).
This is consistent with another observation: Modern humans had a population bottleneck of 2000-3000
individuals ca 75000 years ago, which coincides with the Toba eruption. This does not necessarily mean that the
volcano killed off almost all hominids, but it increased the evolutionary pressure, and it is possible that our ancestors
evolved a mutation that enabled them to outcompete and kill most of the hominid competition (including
Neanderthals). What if that mutation is something that roughly translates into "symbolic behavior"?
I currently think that much of our civilization might be the result of a series of quite specific mutations. Our
ancestors went from 3000 individuals to one million and remained there until they developed religions. Religion and
other ideologies are based on a need for conformance to internalized norms, i.e. an innate desire to serve as part of a
system that is larger than the individual's reputation-based group. They were also based on a shared conceptual space.
Challenge 1 seems mostly to amount to: verify that 1. all human groups have language, and 2. there is no
grammatical non-human language. One of the interesting questions might be if dolphins have grammatical language,
another one concerns the limits of learning in non-human primates. The challenge is completely empirical.
Challenge 2 seems very exciting to me; I read it as: does language intrinsically have linear order, or is that only imposed
by the sequentialization of articulation? Grammatical language has a tree structure, and the tree seems to be created
probabilistically in the listener, from a string of discrete symbols. Would natural language be learnable without the
constraints of sequentiality and discreteness?
Challenge 3: do we need externalization to learn and process language? I would suspect that an individual can
play a language game against itself until it converges on its own language, but it is not clear that humans are among the
class of individuals that can do that from scratch. Most research suggests that there is a critical window in which we
must pick up our first language for perfect fluency, and there seems to be no evidence of entirely individualistic
acquisition/formation of a first language. If that is true, is that a constraint of the way language learning is implemented
in the human brain, or a complexity constraint within language itself?
It seems to be clear that learning a programming language changes the way we think, i.e. it provides evidence
for a weak version of the Sapir-Whorf hypothesis. But that is not so much a constraint of externalization, but of the
semantic structures addressed by the language.
I imagine that pure work in a computer science lab can make some interesting progress on challenges 2 and 3.
Challenge 4: I don't understand enough about the context to see the significance yet; I would think that once
we have an SMT model of language formation, we can learn additional operations that perform operations on the
generated mental representation, based on arbitrary signals. This may require us to leave an approach that attempts to
sandbox language from general cognition, but why would we want to constrain SMT based models by such a sandbox?
Challenge 5: Again, I don't understand enough of the context to understand why probabilistic interpretation
cannot fill in the gaps. A probabilistic model will weight alternatives, and the binary Merge is the simplest, preferred
case?
Challenge 6: The question of the structure of individual lexical items might require a perspective that integrates
mental representations beyond language/SMT.
Challenge 7: Do semantic atoms refer to the external world ("referential doctrine")? This seems to be quite
clearly false; they refer to representations in the neocortex that are mutable and acquired through learning (structure
or reinforcement) and inference.
Challenge 8: Noam seems to agree with my take on 7. How are semantic items acquired? This challenge comes
down to the general problems of learning and perception, i.e. pretty much everything in cognitive science outside of
language! Challenge 8 seems to be designed by a rocket scientist who specializes in combustion chambers and leaves all
other parts of getting the rocket to fly as an exercise to their grad student...
Challenge 9: Noam suggests that meaning must be derived from innate information, and wants to study
universals between languages to identify the innate bits. However, it is not clear that they do not instead stem from the properties
of mathematics, i.e. there is a limited space of "useful simple axiomatic systems" that can be individually explored by
learning systems. Kant attempted to describe this space, identified it as a priori and synthetic, and listed the basic
structural categories that we would use to characterize the world. Sowa and a few others have made contributions to
basic ontologies, and perhaps it is time to revisit Kant's project?
Challenge 10: Do music, planning, arithmetic stem from language, or do all result from a shared innovation of
modern hominid brains? Obviously, different answers in that space might be possible, for instance music could be a
parasitic byproduct of rewards for discovering compositional representations that our brain needs to make us interested
in learning grammar, while basic planning is independent, and complex planning needs language for structuring and
operating on the conceptual space. This makes the question extremely general.
It also gives rise to the more general question of what exactly makes Homo sapiens different from the other
chimpanzees. I suspect that our brains are trained layer by layer, whereby each layer has a time of high plasticity during
its primary training phase, then undergoes synaptic pruning, and has low plasticity later on. The duration of the training
phases is regulated by genetic switches. Increasing the duration will extend infancy and childhood (i.e. increase the cost
of upbringing), but give each layer more training data. Perhaps humans outperform other apes because they get an
order of magnitude more training data before their brains lose infant plasticity, which results in dramatically better ability to
generalize and abstract?
Challenge 11: Rare constructions can be understood by children, and thus there should be a mechanism to
derive them from more simple rules, despite apparent evidence to the contrary, which should be explained (away).
Challenge 12: Noam suggests that the complexity of most constructions in the face of "poverty of stimuli" means
that 1. I-languages are very similar, 2. their differences result from externalization, and 3. they should therefore stem from UG. He
wants this shown, or an alternative.
An alternative explanation might be that the space of possible human grammars is small enough to allow rapid
convergence, and in polyglots even allow for a complete mapping. That would not be a property of an
evolutionary-engineering UG, but an a priori of the mathematics of human grammars.
Challenge 13: What small change in a brain could lead to the unique cognitive abilities of Homo sapiens,
including language? There are a lot of different hypotheses about this, among them what I suggest in (10), and
differential attention/reward for learning compositional structures, or several successive modifications in the reward
system. I think that Noam suspects that the culprit is a new connective pathway, perhaps somewhat similar to Julian
Jaynes's Bicameral Mind hypothesis?
These challenges are extremely inspiring food for thought!
Bests,
Joscha
> On Mar 1, 2017, at 7:01 AM, jeffrey E. <[email protected]> wrote:
> <Challenges Language 2-17.docx>
please note
The information contained in this communication is confidential, may be attorney-client privileged, may constitute
inside information, and is intended only for the use of the addressee. It is the property of JEE. Unauthorized use,
disclosure or copying of this communication or any part thereof is strictly prohibited and may be unlawful. If you have
received this communication in error, please notify us immediately by return e-mail or by e-mail to
jeevacation@gmail.com, and destroy this communication and
all copies thereof, including all attachments. copyright - all rights reserved
ℹ️ Document Details
SHA-256
8843d69cb91d1dea73bfb2b459dfac13e04cf2bfb97fbc2f632263f6a8f52ed5
Bates Number
EFTA02658453
Dataset
DataSet-11
Document Type
document
Pages
4