The Rise, Fall and Rise of the Specialty of Speech


For over 50 years attempts have been made to determine if humans possess a ‘special,’ dedicated neural system for processing speech or if we use a general mechanism for all sounds, including speech. The theory of a dedicated speech-processing substrate rose to major influence, fell out of favour and, remarkably, is now regaining consideration. One consistent property of the theory, however, is controversy.

Listening to and understanding someone who is speaking probably couldn’t seem more natural and effortless. We do it even when we don’t want to – think of noisy neighbours. Yet, in comparison with other sounds that are meaningful to us, human speech is a formidably difficult stimulus to extract information from. Unlike written language, speech is a continuous onslaught of many varying sound properties, each contributing to the intended message. It advances rapidly through time, mostly without pauses between words. And if slight variations in its sonic attributes are not correctly perceived, the consequences could be dire.

In fact, the idea that speech processing is in some way unique is rooted in early attempts at text-to-speech technology. It was in the 1950s, long before today’s computing capacity, that Alvin Liberman began designing and testing speech readers for the blind. Little did he know that such an endeavor would lead him to develop one of the most widely known theories of how we perceive speech – The Motor Theory of Speech Perception. Amazingly, this theory holds that when we perceive speech we are actually perceiving vocal tract gestures – the physical motor articulations that gave rise to the sounds we hear – rather than extracting meaning directly from the sounds emitted by the speaker. The evidence for this was striking enough to suggest a ‘special,’ innate module for processing speech.

The Motor Theory of Speech Perception

When Alvin Liberman tested out his text-to-speech reader he was met with disappointing results. At that time, the idea was to convert text into a series of tones so that, with some practice, one could recognize the words coded by a unique sequence of these tones. Despite practice, however, it wasn’t feasible. The tones just sounded like a very rapid series of bleeps and blurps, too fast to be intelligible. Perhaps the task would have been like trying to understand R2D2 if he were somehow reading these very lines. Participants simply did not have the temporal resolving power to decipher the sounds when they were presented at a practical rate.

How could this happen? Speech, even at relatively slow rates, involves more complex sound patterning than the tones from Liberman’s text reader. When he investigated the acoustic patterns of speech, what he found was surprising. He discovered a phenomenon now called ‘coarticulation.’ It occurs when distinctive speech sounds overlap in time – we co-articulate multiple, unique speech components simultaneously, lengthening their duration and thereby making them easier to hear. This phenomenon underlies much of the complexity of the auditory speech signal and forms a main root of the Motor Theory.

Coarticulation lengthens sounds but causes variability – the kind of variability that creates ambiguity. That is, there are multiple acoustic cues for a given speech sound, and likewise, a single acoustic cue can be perceived as different speech sounds. An example is the word “say,” whose component sounds can be perceived as “day” or “stay.” How then do we disambiguate the speech stream? Critically, Liberman realized that it is the vocal gestures that create speech sounds that are the reliable cues to what the speaker intended to communicate. That is, if we can’t rely on the acoustic cues, we can rely on perception of the movements, or the ‘articulations,’ that give rise to the acoustics of speech to be able to understand it. If we can understand someone talking, our perception must track the vocal gestures and not the ambiguous acoustic cues. And hence, the Motor Theory of Speech Perception came into being in 1957.

Speech is special: Evidential highlights

On the surface, the theory admittedly seems odd and unlikely, but it has garnered some intriguing evidence. One line of support, discovered by Liberman himself, was that elementary speech sounds (which can be shorter than single words) seemed to be perceived categorically rather than continuously. This means the threshold between perceiving one speech sound rather than another is discretized – black and white. Thus, this aspect of our perception of speech runs contrary to the very basic acoustic properties that actually comprise it. Take the properties of loudness and pitch, for example. These properties are perceived on continuous levels, as all the shades of grey between black and white are. This is ironic, since speech is produced as a continuous sound. This special way of perceiving speech became known as ‘Categorical Perception’ and it was implied that only a speech-specific module could underlie categorical perception.
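To make the black-and-white idea concrete, here is a minimal sketch of what a categorical listener looks like on paper. It is purely illustrative and not taken from Liberman's work: the acoustic cue is an abstract value between 0 and 1, and the function names, the boundary of 0.5 and the steepness of 20 are all assumptions chosen for the example. A steep logistic curve stands in for the listener, so two stimuli that straddle the boundary seem far more different than two stimuli the same physical distance apart on one side of it.

```python
import math

def identify(cue, boundary=0.5, steepness=20.0):
    """Probability of hearing sound B (rather than A) for a cue value in [0, 1].

    A steep logistic curve stands in for a categorical listener:
    responses flip from A to B within a narrow band around the boundary.
    The boundary and steepness values are hypothetical.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (cue - boundary)))

def discriminability(cue_1, cue_2):
    """How different two stimuli seem if only their category labels are compared."""
    return abs(identify(cue_1) - identify(cue_2))

# Two pairs with the same physical separation (0.2 units of the cue).
within_category = discriminability(0.1, 0.3)  # both on the A side
across_boundary = discriminability(0.4, 0.6)  # straddles the boundary

print(f"within category: {within_category:.3f}")
print(f"across boundary: {across_boundary:.3f}")
```

A listener who perceived the cue continuously, the way we perceive loudness, would instead find the two pairs about equally easy to tell apart.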

‘Duplex Perception,’ discovered by Timothy Rand in 1974, is another phenomenon thought to be special to perceiving speech. It occurs when listening through headphones. The first part of a syllable is played in one ear and the rest is presented to the other. This second part is perceived as a continuation of the first part of the sound. But in the other ear it sounds like a simultaneous, nonspeech ‘chirp.’ How can we perceive the same sound as speech and nonspeech at the same time? It was thought that there must be both a speech module and an otherwise general module.

A similar phenomenon centers on what is called sine wave speech (SWS). SWS is artificially modified from real speech, consisting simply of pure tones varying in frequency – it contains none of the acoustic cues that were thought to be central to speech perception. This was a field-changing discovery by Roger Remez et al. in 1981. Incredibly, SWS shows that there are zero acoustic cues for predicting what will be perceived as speech other than sine wave tones compatible with resonances of the vocal tract (i.e. the motor articulations that created them).
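For readers who want a feel for how sparse such a signal is, the following is a rough sketch of the synthesis idea: a handful of pure tones whose frequencies glide over time, added together. It is not the procedure used by Remez et al.; real sine wave speech takes its frequency tracks from the measured resonances of a recorded utterance, whereas the three tracks, the amplitudes, the sample rate and the function name below are made-up values for illustration only.

```python
import math

SAMPLE_RATE = 16000  # samples per second (an assumed value)

def sine_wave_speech(tracks, duration_s):
    """Sum a few pure tones whose frequencies glide linearly over time.

    tracks: list of (start_hz, end_hz, amplitude) tuples, one per tone.
    Returns a list of float samples.
    """
    n = int(SAMPLE_RATE * duration_s)
    phases = [0.0] * len(tracks)
    samples = []
    for i in range(n):
        frac = i / max(n - 1, 1)  # position in the sound, from 0 to 1
        value = 0.0
        for k, (start_hz, end_hz, amplitude) in enumerate(tracks):
            freq = start_hz + (end_hz - start_hz) * frac
            # Accumulate phase so the tone stays smooth while its frequency changes.
            phases[k] += 2.0 * math.pi * freq / SAMPLE_RATE
            value += amplitude * math.sin(phases[k])
        samples.append(value)
    return samples

# Hypothetical tracks standing in for three vocal tract resonances.
example_tracks = [(700.0, 300.0, 0.5), (1200.0, 2200.0, 0.3), (2500.0, 3000.0, 0.2)]
tones = sine_wave_speech(example_tracks, duration_s=0.25)
print(len(tones), "samples")
```

Writing the samples to a WAV file (for instance with Python's `wave` module) would make the result audible; with arbitrary tracks like these it will sound like whistles rather than words.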

More evidence supporting a speech-specific, motor-integrated mechanism comes from multisensory research – the famed ‘McGurk’ illusion. Try it out yourself. Play the video at this link and you’ll hear the syllable “ba”. Now close your eyes and play it a few times. How could it sound different but be the same?

From a motor-theory perspective, what you hear is influenced by witnessing a different motor articulation than the one that produced the sound – that of /ga/. The idea is that your brain resolves the incompatibility by guessing it was neither /ba/ nor /ga/ and that it must have been /da/, a possible misperception of both audio and visual inputs. The key is that the illusion only happens when you witness the motor activity at the same time, implying a direct link between seeing vocal gestures and speech perception.

What’s more, it’s possible to change what people hear by externally moving their face muscles. This seemingly unlikely finding was made by Takayuki Ito et al. in 2009. Participants were physically connected to a device that could move their lips. While they witnessed videos of a person saying words, the device was activated and their lips moved. Amazingly, they misheard the words audiovisually presented to them in a systematic manner, more compatible with their own artificially created lip movement. Thus, their own speech-related gestures influenced the words they heard.


Figure shows setup of Ito et al., 2009

These studies comprise some of the strangest, most intriguing and most influential support for the existence of a speech-specific mechanism in our brains. They are not, however, the only studies. For further reading, I suggest reviews by Galantucci et al. (2006) and Carbonell and Lotto (2014).

Speech perception: Not so special

The intrigue raised by the proposition of a special speech module is as phenomenal as the evidence raised to support it. As with all other lofty claims, however, it did not proceed without detractors.

One of the first major blows to the ‘speech is special’ theory was dealt by a team of four chinchillas. Patricia Kuhl and James D. Miller (1975) were able to train them to distinguish a /d/ sound from a /t/ sound. The acoustic properties between these sounds vary continuously, yet we normally perceive them categorically, consistently as a “t” or as a “d” in their normal speech context. The unlikelihood of chinchillas having categorical perception is underscored in a statement by Liberman himself: “Unfortunately, nothing is known about the way that non-human animals perceive speech… however, we should suppose that lacking the speech-sound decoder, animals would not perceive speech as we do, even at the phonetic level.” Unfortunately for the uniquely human “speech-sound decoder” idea, the chinchillas did perceive the /t/ and /d/ sounds categorically. The findings raise the question of why chinchillas would have such a speech-decoding module if they can’t talk.


Yet categorical perception of sound remained a property associated with speech until 2005. That year, Travis Wade and Lori Holt demonstrated that categorical perception also occurs with nonspeech sounds. Cleverly, they devised a unique video-game context in which sounds helped gamers to identify targets in a maze. Whenever the targets appeared, the sounds were played along with them. As the game progressed, the targets became less visible, so the sounds incidentally aided players in identifying them. After playing the game, participants completed a categorization task involving the sounds they had heard. Amazingly, the results were consistent with Categorical Perception. Moreover, the categories were learned incidentally because the participants didn’t need them to play the game and were not instructed to pay attention to them. In sum, a learned categorical perception of nonspeech sounds was not good news for advocates of an inborn, speech-specific brain module.

What about going beyond acoustic perception and directly testing the link between observation of action and the perception of sound, as in the McGurk illusion? Can seeing human motor action influence our perception of nonspeech sounds? It turns out that yes, it can. In 1993, Helena Saldaña and Lawrence Rosenblum used videos of cellos being either plucked or bowed to see if watching these actions could influence judgments of plucking and bowing sounds. Indeed, they found that seeing a cello plucked was more likely to result in the auditory perception of plucking rather than bowing, and vice versa – a finding that supports a generalized multisensory theory of perception, not solely tied to speech.

In 2010 Joseph Stephens and Lori Holt revisited Liberman’s attempt to essentially convert speech from one modality to another. They invented an auditory-to-visual speech reader – in some respects, the opposite of Alvin Liberman’s visual text-to-speech reader. Stephens and Holt’s ‘robot’ turned speech sounds into coded changes in dials and lighted bars. What they found was that with practice, the visual signals could be used to enhance the intelligibility of speech in constant obstructing noise. This finding demonstrates that arbitrary visual cues can be used to influence the perception of speech and that such influence is not limited to witnessing a speaker’s vocal tract movements, as in the McGurk illusion.

These and a host of other experiments cracked the foundations of the Motor Theory and the notion that speech perception is accomplished only with an innate, dedicated speech module in the brain. What’s important is that these studies demonstrated in unique ways that the basic properties used to implicate such a mechanism are found in animals that don’t talk, can be learned and are present in the perception of nonspeech sounds. It seems that only a ‘general mechanism’ can explain these findings.

The rebirth of ‘Speech is Special?’

In a study published in Nature Neuroscience in 2015, Tobias Overath and David Poeppel used functional magnetic resonance imaging (fMRI) to isolate a neural substrate that responds selectively to the temporal attributes of speech. Arguably, this finding brings the quest to find a specialized speech mechanism full circle. It provides some answer to Liberman’s original inquiry as to why we can perceive speech at the rate it occurs, but not text converted into synthetic sounds played at the slowest practical rate.

Overath and Poeppel took the rapidity of acoustic speech into consideration, setting out to determine the existence of brain areas tuned to the temporal structure of natural speech. To do this, they created what they called “sound quilts.” These stimuli consisted of recorded speech segments in a foreign language that were broken up into smaller ‘patches,’ which were then reshuffled to make a new segment. Here, they carefully eliminated the abruptness that would otherwise occur at the end of one patch and the beginning of another. The more the stimuli were quilted in this way, the more disruptive they were to the regular temporal patterning of speech. Remarkably, they found a part of the Superior Temporal Sulcus (STS) that showed responses that varied with the amount of quilting in the stimuli. This finding implies that these areas are ‘tuned’ to the temporal properties of speech, in that there were greater responses to less quilted, less “patchy,” more natural speech.
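The quilting idea can be sketched in a few lines. This is a deliberately simplified stand-in, not the published algorithm: it cuts a signal into equal-length patches, shuffles them at random, and softens each patch's edges with a short fade, whereas the original study handled the joins between patches far more carefully. The function names, the `fade_len` parameter and the fixed random seed are assumptions made for the example.

```python
import random

def taper(segment, fade_len):
    """Fade a patch in and out so that joins between patches are less abrupt."""
    seg = list(segment)
    for j in range(min(fade_len, len(seg) // 2)):
        gain = (j + 1) / (fade_len + 1)
        seg[j] *= gain
        seg[-1 - j] *= gain
    return seg

def make_quilt(samples, segment_len, fade_len=0, seed=0):
    """Cut samples into equal-length patches, shuffle them, and stitch them together.

    Any leftover samples that do not fill a whole patch are dropped.
    """
    segments = [samples[i:i + segment_len]
                for i in range(0, len(samples) - segment_len + 1, segment_len)]
    rng = random.Random(seed)  # fixed seed so the shuffle is repeatable
    rng.shuffle(segments)
    quilt = []
    for seg in segments:
        quilt.extend(taper(seg, fade_len))
    return quilt

# A toy "recording": twelve samples cut into patches of three.
signal = [float(i) for i in range(12)]
quilt = make_quilt(signal, segment_len=3)
print(quilt)
```

In this sketch a smaller `segment_len` plays the role of heavier quilting: the shorter the patches, the less of the original temporal structure survives the shuffle.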

This original experiment was followed by an onslaught of control conditions. Importantly, they ruled out alternative explanations based on acoustic properties such as pitch and amplitude. In a final experiment they produced quilts using nonspeech sounds consisting of, for example, bird song and footsteps. These nonspeech stimuli did not elicit the same, unique response observed with speech quilts. Ultimately, their evidence for a speech-specific neural substrate is considerable.

What’s next?

Although Alvin Liberman passed on in the year 2000, it’s likely that he would be amazed by the tumult his original ideas have caused over the greater part of the last two decades. What does the future hold? Will the Overath and Poeppel findings be replicated? Will brain imaging produce further evidence? Regardless, investigations surrounding Liberman’s Motor Theory of Speech Perception reveal remarkable connections between our perceptions of auditory and visual information and our implicit knowledge of how to generate the actions that create the sounds of speech.

Further Reading:

Carbonell, K. M., & Lotto, A. J. (2014). Speech is not special… again. Frontiers in Psychology, 5, 427.

Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin & Review, 13(3), 361–377.

Ito, T., Tiede, M., and Ostry, D. J. (2009). Somatosensory function in speech perception. Proc. Natl. Acad. Sci. U.S.A. 106, 1245–1248. doi: 10.1073/pnas.0810063106

Kuhl P., Miller J. (1975) Speech perception by the chinchilla: Voiced-voiceless distinction in alveolar plosive consonants. Science. 190(4209), 69–72.

Liberman, A. M., Delattre, P., and Cooper, F. S. (1952). The role of selected stimulus-variables in the perception of the unvoiced stop consonants. Am. J. Psychol. 65, 497–516. doi: 10.2307/1418032

Liberman, A. M., Harris, K. S., Hoffman, H. S., and Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. J. Exp. Psychol. 54, 358. doi: 10.1037/h0044417

McGurk, H., and MacDonald, J. (1976). Hearing lips and seeing voices. Nature 264, 746–748. doi: 10.1038/264746a0

Overath, T., McDermott, J. H., Zarate, J. M., & Poeppel, D. (2015). The cortical analysis of speech-specific temporal structure revealed by responses to sound quilts. Nature Neuroscience, 18(6), 903–911.

Rand, T. C. (1974). Letter: Dichotic release from masking for speech. J. Acoust. Soc. Am. 55(3), 678–680. doi: 10.1121/1.1914584

Remez, R. E., Rubin, P. E., Pisoni, D. B., & Carrell, T. D. (1981). Speech perception without traditional speech cues. Science, 212, 947–950.

Saldaña, H. M., and Rosenblum, L. D. (1993). Visual influences on auditory pluck and bow judgments. Percept. Psychophys. 54, 406–416.

Stephens, J. D. W., and Holt, L. L. (2010). Learning novel artificial visual cues for use in speech identification. J. Acoust. Soc. Am. 128, 2138–2149. doi: 10.1121/1.3479537

Wade, T., and Holt, L. L. (2005). Incidental categorization of spectrally complex non-invariant auditory stimuli in a computer game task. J. Acoust. Soc. Am. 118, 2618–2633. doi: 10.1121/1.2011156


Sleepwalking and the Escape Adaptation


Sleepwalking, a phenomenon on the rise, is actually an intermediate state between sleep and wakefulness and can be dangerous to both one’s self and others. Considering neurophysiological findings, it may, however, result from an adaptive preparation for escape.

Sleepwalking, formally known as somnambulism, is a form of parasomnia or arousal disorder, most common in children between the ages of four and twelve. It encompasses a range of active behaviour complex enough to be representative of a wakeful state, but with a lack of judgment and memory of the activity. In fact, there are some extremely remarkable cases of sleepwalking in adults that range all the way from harmless and amusing to murderous. On the harmless side are a woman who, in her sleep, was “pissed off” because she couldn’t open a “tomato cage,” and a man who amazingly earns a significant amount of money from drawings he produces while asleep. Serious incidents include a teen who jumped out of a window 25 ft. above ground and another individual who froze to death in his sleep. Probably the most serious case is that of Kenneth Parks, who drove 23 km and then murdered his mother-in-law. Notably, sleepwalking as a criminal defense has become a growing topic in legal and ethical studies. This is largely because sleepwalking is a phenomenon on the rise, potentially as a side effect of increased use of sedatives such as Ambien, which have been argued to precondition parasomniac behaviour. Thankfully, however, most sleepwalking incidents are on the harmless side.

So what do we know about the cause of sleepwalking? Over millennia, sleepwalking was thought to result from everything from dryness of the brain, to vapors that stem from digestion, to an imbalance in body humors. One common theory was that it was an enactment of dreams, most often occurring during REM sleep. Relatively recently, however, Roger Broughton (1968) discovered that sleepwalking raises its ‘zombies’ out of non-REM (NREM) sleep, specifically deep, slow wave sleep (SWS). Unlike REM sleep, which is characterized by low amplitude EEG waves varying in frequency, SWS is defined by high amplitude oscillations, low in frequency. These patterns are known as delta waves.

Sleepwalkers actually seem to have a subconscious eagerness to rise out of this deep sleep stage. Localized cortical areas that underlie motor activity are more likely to become fully active, showing patterns of activity that are remarkably similar to those observed during a wakeful state. Meanwhile, frontal and parietal associative areas that interface one’s goals with incoming sensory information maintain or even increase their sleepy delta wave pattern. When the frontoparietal network is in such an inactive state, its usual role in inhibiting weird, non-goal-directed behaviour becomes subdued. In fact, a direct, inhibitory link between frontoparietal areas and the cingulate gyrus, an area central to complex motor activity, has been implicated as a prime sleepwalking root. It’s as if, during sleep, the motor area’s arousal threshold has been lowered compared to other parts of the brain. Why would that be?

Some think this motor-readiness was an adaptive feature that possibly helped survival. Sleeping, by definition, is characterized by a loss of consciousness. However, this loss of consciousness is surprisingly incomplete. In fact, using multiple imaging and recording techniques, it’s been demonstrated that auditory stimuli presented during sleep can trigger activation in both the auditory cortex and the amygdala – a center directly related to the emotional ‘fight or flight’ response. What’s even more intriguing is that even while sleeping, our brains can selectively discriminate and process unfamiliar sounds (What made that noise??) and emotionally meaningful stimuli such as one’s own name. It seems we are tuned to process behaviourally relevant stimuli, even while we are fast asleep. This is probably not a coincidence, because novel and/or meaningful stimuli could signal the presence of danger. Considering that we’ve had to avoid and escape potential predators for as long as we’ve been part of the animal kingdom, it’s been highly adaptive to evolve an arousal system that maximizes the benefits of sleep while minimizing the chance of being eaten. Potentially, sleepers without a low threshold for motor circuitry activation got eaten.

Sources/Further Reading

  1. Dissociated wake-like and sleep-like electro-cortical activity during sleep.
  2. Auditory processing across the sleep-wake cycle: simultaneous EEG and fMRI monitoring in humans.
  3. Disruption of hierarchical predictive coding during sleep.
  4. Neural Dynamics of Emotional Salience Processing in Response to Voices during the Stages of Sleep.
  5. Progression to deep sleep is characterized by changes to BOLD dynamics in sensory cortices.
  6. Arousal modulates auditory attention and awareness: insights from sleep, sedation, and disorders of consciousness.
  7. Acoustic oddball during NREM sleep: a combined EEG/fMRI study.
  8. Subconscious Stimulus Recognition and Processing During Sleep.
  9. Darwin’s Predisposition and the Restlessness that Drives Sleepwalking.
  10. Neural Markers of Responsiveness to the Environment in Human Sleep.
  11. Preferential processing of emotionally and self-relevant stimuli persists in unconscious N2 sleep.
  12. Disorders of Arousal.
  13. Somnambulism: clinical aspects and pathophysiological hypotheses.
  14. Parasomnias: an updated review.
  15. Sleepwalking episodes are preceded by arousal-related activation in the cingulate motor area: EEG current density imaging.

What Distinguishes Us as Human?


It’s commonly agreed that what sets us apart from other animals is our linguistic ability. However, musical ability is also unique to humans. Common to both language and music is the capacity to unconsciously combine conceptual elements in an infinite number of ways.

Toolmaking and cognitive evolution

Chipping a piece of flint to make a hand axe seems easy enough. If, 1.76 million years ago, australopithecines could do it, then surely now anyone could. But although early stone tools were simple, their craft became increasingly complex and it took a considerable amount of skill to make them.

As discovered by Dietrich Stout, learning to make a stone hand-axe – a practice called ‘knapping’ – can take up to 300 hours. It requires multiple stages of planning, several tools, fine motor coordination and inhibitory control. Monitoring of the knapping learning curve using multiple brain imaging techniques shows that acquiring advanced knapping skills produces changes in the brain implicated in greater cognitive control. Crucially, these studies imply that over evolutionary time, our brains changed to accommodate the considerable skills required for making complex stone tools. It is now well recognized that the evolution of the motor system in primates underlies great advancement in cognitive capacity.

What happens in the brain while an advanced knapper knaps? Imaging studies reveal activity in Broca’s area – part of the inferior frontal gyrus (IFG). Related primate studies have shown that within area F5, which is considered the ‘monkey Broca’s area,’ are the mysterious mirror neurons that are active during both the observation and execution of goal-directed hand movements. Theories regarding the function of mirror neurons are numerous and far-reaching. One of the most plausible claims is that they facilitate learning through imitation. However, the fine motor coordination required for making advanced stone tools is thought to be too complex to be routinely learned solely by ‘hominin see – hominin do.’ Instead, this ability required complex representations of sequences of goal-oriented motoric action – a notion first implied by archaeologist André Leroi-Gourhan in 1964. Some have termed this sequencing a ‘grammar of action.’ This representational aspect facilitates a critical cognitive leap, beyond simple mimicry.

Studies of the IFG clearly show a surprisingly vast functional overlap between linguistic processing and praxis (detailed motor planning and execution). To quote Stout:

“… it is now recognized that frontal ‘language relevant’ cortex extends across the entire inferior frontal gyrus (IFG) and contributes to a diverse range of linguistic functions… Furthermore, IFG is known to participate in a range of non-linguistic behaviours from object manipulation to sequence prediction… It has been proposed that this superficial behavioural diversity stems from an underlying computational role of IFG in the supramodal processing of hierarchically structured information.”

The ability to process hierarchically structured information is key here. It could have been crucial for stone tool making, language and ultimately for survival. It also underlies another uniquely human ability – musicality.

Language and music

Charles Darwin thought that musical enjoyment was something common to many species: “The perception, if not the enjoyment, of musical cadences [i.e., melodies] and of rhythm is probably common to all animals, and no doubt depends on the common physiological nature of their nervous systems.” But is this correct? Do other animals engage in and react to music in the same way we do?

The focus of music is often on this emotional quality. Indeed, music moves us in more than just physical ways and provokes intense reactions. Where does the emotional charge of music come from? Musical enjoyment is tied to the release of dopamine. Similarly, dopamine release underlies addictive behaviours – the intense feelings of anticipation and reward that can happen when playing a slot machine and winning, for example. Music we like often contains climactic points that we unconsciously predict because we have previously identified these musical patterns. When our predictions about these climactic points turn out to be correct we find it satisfying in the same physiological manner.

Considering our faculty for such pattern identification from a broader perspective, unconsciously identifying patterns in sound is a feat common to both musical and linguistic abilities. Specifically, both music and language are organized syntactically – they are arranged according to a complex, rule-based pattern that we amazingly learn and practice unconsciously. Syntax is the faculty that enables us to combine discrete structural elements into larger sequences, using a shared rule-based system. When music and language don’t follow these rules they become noisy or nonsensical. In fact, studies involving a number of brain imaging and neural activity recording techniques consistently show similar responses (some indistinguishable) to syntactic violations in both music and language.

The shared functional and anatomic resources of linguistic and musical processing are, in fact, central to explanations of generalized improved learning in students with musical training. Musical training and aptitude have commonly been used as a significant predictor of academic success. Linguistic and musical activities are thought of as tools for increasing prefrontal cortex development and thus promoting increases in cognitive capacity.

A broader sense of us

Saying that humans are defined by exceptional musical and linguistic capability is a rather clumsy and narrow way to state our uniqueness. Importantly, what underlies both these abilities is the capacity to structure conceptual elements into an infinite number of complex arrangements. Thus, as best as I can put it, what sets us apart from other animals is our infinitely generative syntactic capacity. Applied solely to linguistics, this feat has been proposed as something called ‘Merge.’ Perhaps a broader term that encompasses music awaits us.

Auditory Pareidolia Does Not Exist

The phenomenon of Pareidolia consists of perceiving patterns or objects where there are none. Usually regarded as a visual experience, examples of similar auditory phenomena abound. Yet, a clear definition of auditory pareidolia remains elusive. Once an exact definition is obtained, auditory pareidolia could serve as a useful tool for investigating the brain.

Pareidolia – What is it?

You might not recognize the term, but you’ve probably experienced Pareidolia or have seen examples in pictures: the eerie face on Mars, religious figures on toast, etc. Pareidolia is defined as perceiving meaningful patterns or objects in non-patterned or amorphous stimuli (Schott, 2013; also see this post). The phenomenon is an example of something called Apophenia: the tendency to perceive patterns where none exist, often associated with delusions and schizophrenia. Despite this relation, an experience of pareidolia does not imply pathology. In fact, some regard the earliest evidence for abstract symbolic thought in hominids to be a pebble that, to an individual australopithecine, appeared as a striking impression of a face and was subsequently brought home, around 3,000,000 BP. Another early example is the grouping of stars into what are commonly known as constellations.

Seeing a real face informs us of the presence of another animal (human or nonhuman) around us – another animal that can potentially harm or help us in important ways. In fact, it has been shown that we are ‘tuned’ to notice such stimuli. Pareidolia is not, however, limited to the appearance of faces. Take, for example, the Rorschach inkblot test used in psychoanalysis to uncover subconscious thought patterns. Pareidolia is simply the experience of perceiving something meaningful or structured within something that is not structured, and it comes from the natural human tendency to find meaningful patterns in our surroundings. It’s an act of making our environment intelligible – an adaptation obviously very useful for survival.

Auditory Pareidolia

Pareidolia is often described as existing in auditory form and there are many interesting examples. One type includes the wondrous ‘phantom voices’ created by psychoacoustic researcher Diana Deutsch. These fascinating stimuli consist of two spoken words or a single word broken up into two syllables. One word or syllable is played in one stereo channel while the other is put through the other channel, and the process is alternated across channels at a high rate. Initially the components are not recognizable as words but rather sound as though they’ve emerged from some sort of linguistic ‘uncanny valley.’ As this process repeats, entirely different, new words appear. Even more interesting is that the words heard are often related to the current psychological state of the perceiver – “hungry,” “happy,” “lonely” – in a manner similar to the internal states thought to be uncovered by the visual Rorschach test.

Phantom Words 1

Phantom Words 2

Phantom Words 3

Another common example comes from popular paranormal circles and is known as Electronic Voice Phenomena (EVP). Popularized by parapsychologist Konstantīns Raudive in the 70s, EVP consists of spooky voice-like recordings found mostly in radio static but also in other electronically based recordings. They occur when random noise gives the impression of a voice and are thought by believers to emanate from the spirit world. A canonical set of examples includes a collection of relatively early recordings, compiled as Ghost Orchid.

Defining the Phenomena

Scientific exploration, where possible, turns phenomena into matters that are well understood and defined. Since an extensive search turns up neither a dictionary entry nor a credible, peer-reviewed publication that clearly defines auditory pareidolia, it remains in the land of phenomenology. One can look to a number of other sources, with varying amounts of credibility. For amusement, here’s a list of potential definitions (with sources) and the difficulties I see with each one:

  1. Tweeter who shall remain anonymous:

“Auditory pareidolia is a situation created when the brain incorrectly interprets random patterns as being familiar patterns.”

Well, it has to be auditory. And, what is defined as a familiar pattern anyway? Does misperceiving a non-voice as a voice qualify as interpreting a pattern? Also, there is no such thing as a “random pattern”.

  2. An online paranormal magazine creator:

“It’s when your mind is desperately trying to grasp words out of sounds you’re hearing. Listen to a song in another language and your mind will interpret some words in your own language.”

So the phenomenon is restricted to perceiving words? Moreover any actual conscious perception is not necessary?

  3. Dictionary of Hallucinations

“…a cognitive illusion consisting of intelligible and meaningful words discerned in a pattern of unintelligible words, random sounds, or white noise.”

This definition also implies that the phenomenon is restricted to hearing words.

  4. New England Skeptical Society

“Audio pareidolia is hearing words in sound that are not actually there. This can occur by misinterpreting words that are being said, or by hearing words in random noise. The phenomenon is the same as with visual pareidolia, in that the brain is searching for a recognized pattern, finds the closest match, and then processes the incoming sensory information to enhance the apparent match. Here is an example (sent in by Peter Davis) – a Youtube video of what looks like a church group singing a song. Below are subtitles suggesting what they are saying – and this is sufficient suggestion to force a match between what you are hearing and the words in the subtitles.”

This one defines visual pareidolia in a broad sense and indicates it is basically an auditory mode of the visual phenomenon, but says that it is restricted to hearing words. The example used strongly constrains the effect by the use of subtitles. Does the fact that our specific perceptions can be influenced by being primed change the definition of the phenomenon? Priming is also common in outlandish presentations of EVP, as recorders strive to increase both the spectacle and believability in their recordings.

  5. Hearing Loss Help

“Audio pareidolia is hearing words/music that are not actually in the sounds you are hearing. This can occur by misinterpreting words that are being said, or by hearing words in random noise. In audio pareidolia, your brain searches for a recognized pattern, finds the closest match, and then processes the incoming sensory information to enhance the apparent match.”

Not sure exactly what’s meant in the last part of this definition, but let’s just leave that alone. According to this definition, hearing music in random noise also qualifies, but nothing else.

  6. Polysyllabic (blog):

“Pareidolia refers to the human tendency to attribute meaning to random stimuli. Usually, when people talk about pareidolia they mean visual phenomena like Virgin Marys appearing on pieces of toast and the like. But pareidolia can be auditory as well.”

This attempt falls flat before anything to do with auditory pareidolia begins.

  7. Wikipedia

“Pareidolia is a psychological phenomenon involving a stimulus (an image or a sound) wherein the mind perceives a familiar pattern of something where none actually exists.”

The vastness and popularity of Wikipedia underscore the importance of it having an accurate definition. But here, why would the pattern have to be a familiar one? Can’t a pattern just be a pattern? Moreover, this definition would exclude common examples, such as the phantom word stimuli created by Diana Deutsch. These stimuli do have a pattern to them; it’s just that a different pattern emerges perceptually.

  8. Nees & Phillips, 2015

“…In a pilot experiment to test the ability of listeners to detect the presence of anomalous signals in audified noise files, the researchers observed an unexpected number of illusory tonal signals in control files of white noise. Further studies replicated the effect and showed that unprimed, naïve listeners reported illusory mechanical noises, natural noises, tones, and human voices in white noise files.”

Ok, here we have a peer-reviewed publication that directly explores the phenomenon. Great, but they never actually provide a definition of auditory pareidolia!

A Possible Definition?

Auditory Pareidolia: a phenomenon that occurs when a naïve, unprimed listener perceives sounds that are not present in any part, and bear no similarity in meaning, to the actual stimulus.

This definition, although likely prone to its own difficulties, would include the illusory perception of any sound in either random or patterned stimuli, not just words. It would exclude instances where perceivers are primed with, for example, subtitles.

More importantly, once the phenomenon has an agreed-upon definition, it could possibly be of use. Perhaps it could serve in basic science research to investigate neural correlates of perception. It could also have potential as a clinical tool. In a study published in PLOS One this past May, Mamiya et al. reveal their newest efforts to devise a test for visual pareidolia. They define it as a surrogate for visual hallucination and suggest that their tests are clinically helpful. Auditory hallucinations are often more common than visual hallucinations in schizophrenia, a disorder that is as widespread as it is serious. Thus a similar effort may be worthwhile.



The Magnetic Touch Illusion: Perceiving a “Force field” in an Alien Limb


A new study demonstrates an illusion where participants experience a strange sensation of illusory ‘magnetic’ interaction between a tool and their hand. However it’s not actually their hand.

Ownership of your limbs is something generally not questioned. If our time were consumed by uncertainty of our limbs, it would be highly debilitating. However, one can be ‘tricked’ into adopting a rubber hand for their own – a feat that multisensory researchers have been investigating since the dawn of the ‘rubber hand illusion.’ For those not familiar, this illusion can occur when one views a rubber hand while their actual hand is placed out of view. When an experimenter brushes both the rubber hand and the participant’s hand simultaneously with a paintbrush, the participant ‘feels’ the tactile sensation as though it’s coming from the rubber hand and not their own hand. The participant ‘feels’ ownership of the rubber hand in the very same way as their own hand. The illusion has been studied extensively, revealing insight primarily regarding multisensory integration and embodiment.

In the October issue of Cognition, Arvid Guterstam (Ehrsson Lab) and colleagues have published an intriguing examination of an extension of the rubber hand illusion: The Magnetic Touch Illusion. This arrangement, originally conceived of by Jakob Hohwy and Brian Paton, produces a qualitatively different sensation involving the feeling of a “force field” around an alien limb, for which ownership is similarly claimed.

Although the study involves eight experiments, the illusion results from a simple manipulation of the basic rubber hand illusion setup. While the experimenter brushes the participant’s hidden fingers, s/he brushes the space around the rubber hand (instead of brushing the rubber hand itself), making it clearly visible that s/he is not actually making contact with the rubber hand. According to the study, this variation is unique because it produces a sensation of multisensory visual-tactile integration within peripersonal space. Participants described this feeling as a “repelling magnetic force”, a “force field”, or “invisible rays of touch.”

From the series of experiments the authors were able to demonstrate that the illusion persisted with brush strokes around the rubber hand up to a distance of about 40 cm and that the illusion was spatially anchored to the rubber hand – it followed the rubber hand when the hand was moved. It was also concluded that the integration of visual and tactile information was crucial – simply moving the brush closer to the rubber hand in a manner that implies an expected contact did not elicit the effect. In addition, they showed that a visible solid barrier around the rubber hand similarly prevents the illusion from occurring.

From an experiential perspective, one can imagine the magnetic touch illusion severely blurs the boundary between self and other that’s normally perceived very clearly and effortlessly. However, from a knowledge-based perspective, the authors state “The present findings offer an important advancement in understanding of the relationship between the representation of peripersonal space and the sense of body ownership—two processes related to the construction of a multisensory boundary separating the body from the external environment.”

The ‘Science’ of Muzak at Work


The El Pinto Chicken Farm in Albuquerque’s North Valley uses cutting-edge techniques to maximize quality. Apparently they harvest the finest, eggiest eggs that you can taste. Why so good? As claimed by El Pinto, it’s partly because of the Beethoven compositions played for their beloved chickens.

Yes, this is a marketing ploy but it is well-known that music has the power to influence in various ways. When skyscrapers were first built, music was played in elevators to ease the nerves of inexperienced passengers during an otherwise anxious high-rise transport – hence ‘Elevator music’. Music played for patients during and around the time of their surgery has been shown to reduce anxiety.

But can music maximize productivity? According to the Muzak corporation, to whom we owe ‘elevator music’, the answer was a definite “Yes” and the research put towards achieving this effect was remarkable, if not dubious.

During the Second World War, factory work supporting the war effort was an uneasy undertaking. Soon it was found, through surveys, that workers appreciated music being played while they worked. After the war, in the continuing spirit of Taylorism, Muzak sought to systematically harness the potential in music to increase productivity during monotonous, repetitive tasks. Firstly, vocals were taken out of their factory music because they were deemed distracting. The idea was to produce a ‘functional’ music that was valued for its ignorability and its ability to subconsciously increase worker output. It was said to be “music that is heard but not listened to.”

Their efforts were initially based on the studies of Charles Diserens in the 1930s. Diserens is noted, for example, for analyzing typists’ output collected after music was played for participants in the background. He concluded that music could regulate movements through rhythm, increase worker attention, and instill ‘value’ through music’s direct physiological properties. Equally influential was the Yerkes-Dodson law (1908), which states that performance peaks with only a moderate increase in arousal. It was thought that Muzak would be the perfect stimulant.

Armed with such knowledge, Muzak’s ‘scientific’ consultants Harold Burris-Meyer and Richard Cardinell, faculty from the Stevens Institute of Technology in Hoboken, were placed in charge of developing what was called ‘Stimulus Progression’, Muzak’s hallmark ‘scientific’ development. Stimulus progression consisted of segments of music that gradually increased in stimulating effect, reducing worker fatigue during typical morning and afternoon slumps. This was achieved using a unique formula, involving the quantification of ‘mood’ as affected by a number of musical variables.

As a starting point, worker ‘fatigue curves’ were established according to time of day. As described by Cardinell in 1946:

“The first consideration is to determine the variation of employee fatigue… Experimentation in the past has proven that the most effective method of obtaining the required stimulus is by using constant progression of musical brightness throughout the group of selections; brightness may be obtained from several roots: tempo, instrumentation, musical arrangement, melodic line or incidental rhythm. All these factors contribute to the mood of the selection.”

[Figure: fatigue curve for offices]

Stimulus progression was applied to music programming throughout the entire workday in a manner that mirrored the fatigue curves. However, a “group of selections” only lasted approximately 15 minutes, the songs themselves lasting between 2 and 3.5 minutes each. These groups of songs were followed by 15- or 30-minute periods of silence, as it was found that regardless of changes in “brightness,” constant music contributed to the monotony of the work, decreasing productivity. Within these 15-minute groups, values associated with the different root factors (e.g. rhythm) were varied from song to song, but increased on average. This variation was also instrumental in preventing monotony caused by the music itself.

The key element was tempo – the pace of the music measured in beats per minute. Tempo was varied between 40 and 130 bpm. The change in tempo in the muzak songs was centered on 72 bpm, around the average human heart rate. Gradually increasing the overall tempo from the tempo of the first song in the selection was thought to be essential for affecting a worker’s perception of movement through time.
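
The tempo rule described above (vary from song to song, stay within 40 to 130 bpm, but rise on average across a group) can be made concrete with a small Python sketch. This is only an illustration, not Muzak's actual formula, which is not given here. The function names and the example tempi are hypothetical, and a least-squares slope is just one assumed way of expressing "increased on average."

```python
def tempo_trend(tempos_bpm):
    """Least-squares slope of tempo against song position, in bpm per song."""
    n = len(tempos_bpm)
    if n < 2:
        raise ValueError("need at least two songs to measure a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(tempos_bpm) / n
    num = sum((i - mean_x) * (t - mean_y) for i, t in enumerate(tempos_bpm))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_stimulus_progression(tempos_bpm, low=40, high=130):
    """True if every tempo is within [low, high] and the overall trend rises.

    The 40-130 bpm limits come from the text; treating a positive slope as
    the test for a 'progression' is an assumption made for this sketch.
    """
    in_range = all(low <= t <= high for t in tempos_bpm)
    return in_range and tempo_trend(tempos_bpm) > 0


if __name__ == "__main__":
    # Hypothetical tempi for one 15-minute group of five songs.
    group = [66, 72, 70, 78, 84]
    print(tempo_trend(group), is_stimulus_progression(group))
```

Note that the third song in the example is slower than the second, yet the group still counts as a progression because the trend across the whole group is upward.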


Rhythm, closely related to tempo, was also varied systematically during a selection of songs. Rhythm is defined by the form of temporal movement, created by the timing between accented and non-accented beats. These timings included sambas (2/2), foxtrots (4/4), quick-steps (6/8) and waltzes (3/4). As with tempo, there would be an overall increase in rhythmic stimulus across a selection of tracks.


Instrumentation was another key element, with ‘stimulus’ values corresponding to the types of instruments that dominated the songs. Strings were considered to be the softest. Woodwinds, saxophones and oboes filled the middle ground, and brass instrumentation was the hardest and most stimulating (heavy percussion in Muzak was a no-no – again, too distracting). The ratings were specifically tied to the timbre of the instruments – their unique tonal colour or quality defined by the physical aspects of the instrument that produce the sound.

As stated by muzak programmers:

“…the strings produce a more colorful quality of sound. Most dominating and emotionally exciting are the brasses: the trumpets and trombones. In conjunction with the strings and woodwinds, the quality of sound is rich and full-bodied. By eliminating the soft-strings the remaining woodwinds and brasses produce an even more exciting and stimulating sound. (MUZAK, 1956:8)…”


Finally, the last component in the stimulus progression formula was orchestra size. This was the most difficult factor because of qualitative differences in the sound of one or more of the same instrument, the instrumentation and the composition of the piece.

[Figure: orchestra size]

Below is a composite of all these elements for an actual selection of songs. Notice the variation in each but a linearly increasing ‘stimulus progression’ throughout the segment. While most are likely to agree that overall, this clever methodology might indeed have achieved some results, the ‘mood’ curve is not in the realm of anything that was measured directly. The only research to consult is Muzak’s own and one can guess their conclusions.

[Figure: composite of the stimulus progression elements for a selection of songs]

Ironically, progressing variation was the foundation of Muzak’s stimulus progression, yet too much variation in any of these musical attributes was thought to make it distracting and ultimately detrimental to the company’s bottom line. Thus, listeners were left with some of the prettiest, but blandest, most artistically compromised and unemotional music ever made. Indeed, the idea of creating ‘functional music’ as a commodity, piped into places of monotonous work, has been considered by some as a pivotal move in the subversion of culture and meaning in society. They note that previously, songs were commonly sung by workers as a meaningful activity that made time appear to pass faster, work seem easier and camaraderie grow. Muzak, they argue, was the ultimate commodification of music: an invisible product used by corporations as a form of “sonic surveillance” and a tool for unconsciously manipulating people. It is argued to have been an ever-present symbol of authority and control, permeating the space around workers while they completed their tasks, in the guise of making work more pleasant. James Keenen, Ph.D., once Chairman of Muzak’s Board of Scientific Advisors, claims that “Muzak promotes the sharing of meaning because it massifies symbolism in which not few but all can participate.” If this were true, why isn’t anyone participating in Muzak in present times? Did anyone ever participate in Muzak? Although singing work songs is a thing of the past, thankfully we can often listen to our own music if we wish to.

(Images obtained from Jerri Ann Husch’s PhD thesis on muzak and culture. They are referenced to “MUZAK, 1965” and are permissible under the Fair Use Index).

‘Merging’ the Origins of Language and the Search for Intelligent Life

Seemingly, the worlds of linguistics and astrobiology are on opposite sides of the universe. However, recent work on the origin of language may inform Drake’s famous equation regarding the possibility of intelligent life on other planets, drawing a ‘cosmilinguistic’ connection.

55 years ago, at a small conference in Green Bank, West Virginia, astrophysicist Frank Drake proposed a probabilistic argument for deriving the likelihood of extraterrestrial intelligent life. One of his primary goals in this endeavor was simply to stimulate discussion on how to scientifically approach this subject, taking it from the realm of the fringe, to a respected scientific endeavor. After all, this was the first SETI meeting. The equation is formulated as:

N = R* × fp × ne × fl × fi × fc × L

Frank Drake describes the variables basically in order from most to least probable:

N = The number of civilizations in the Milky Way Galaxy whose electromagnetic emissions are detectable.

R* = The rate of formation of stars suitable for the development of intelligent life.
fp = The fraction of those stars with planetary systems.
ne = The number of planets, per solar system, with an environment suitable for life.
fl = The fraction of suitable planets on which life actually appears.
fi = The fraction of life bearing planets on which intelligent life emerges.
fc = The fraction of civilizations that develop a technology that releases detectable signs of their existence into space.
L = The length of time such civilizations release detectable signals into space.
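
Because the equation is a straight product, N scales in direct proportion to each factor. The Python sketch below is one way to see this. The function names are my own, and every number in PLACEHOLDERS is a made-up value chosen only so the arithmetic is easy to follow. None of them is an estimate from Drake or anyone else.

```python
def drake_n(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """Return N as the product of the seven Drake factors.

    The four f-terms are fractions, so they are checked to lie in [0, 1].
    """
    fractions = {"f_p": f_p, "f_l": f_l, "f_i": f_i, "f_c": f_c}
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime


# Hypothetical placeholder values for every factor except f_i.
# They are NOT published estimates; they only make the example concrete.
PLACEHOLDERS = dict(r_star=1.0, f_p=0.5, n_e=2.0, f_l=0.5, f_c=0.5, lifetime=1000.0)


def sweep_f_i(values):
    """Compute N for each candidate f_i, holding the placeholders fixed."""
    return {f_i: drake_n(f_i=f_i, **PLACEHOLDERS) for f_i in values}


if __name__ == "__main__":
    for f_i, n in sweep_f_i([0.0, 0.001, 0.5, 1.0]).items():
        print(f"f_i = {f_i}: N = {n}")
```

With everything else held fixed, halving fi halves N, and an fi of zero gives an N of zero no matter how generous the other factors are. That is why the uncertainty in a single factor matters so much.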

Since the equation’s inception, SETI has become the world’s leading organization on the search for extraterrestrial intelligence and its growth is evinced by its latest branch, the 100 million dollar Breakthrough Initiatives program, commandeered by Stephen Hawking, Russian billionaire Yuri Milner and to perhaps some chagrin, Mark Zuckerberg.

Important milestones in improving the accuracy of the Drake equation include the discovery of exoplanets – planets that orbit stars other than the sun – which are currently known to number over 3000. Going beyond this, however, it is now estimated that 20 to 25 percent of exoplanets are in habitable zones – zones where life could occur – according to physicist, astronomer and writer Adam Frank in his latest NY Times piece.

These figures are especially great news for astrobiologists, who are mostly focused on discovering cellular life and not ‘intelligent life.’ Indeed, the life housed by the Earth has been no more than single cell organisms for about 75% of its existence. If the SETI program is to succeed, it’s going to be the result of contact with an advanced form of life.

A common hypothesis is that advanced life on another planet may be the result of an evolutionary process, just as on Earth. Proponents of this idea include Richard Dawkins, who has pointed out that life as we know it, has followed predictable paths of evolution and that the same predictable paths might be expected on other planets (although intelligent alien life might not be carbon-based and may not resemble anything like us). Indeed, there are examples on earth, of animals that have largely separate evolutionary paths but which share common traits. For example, wings are an adaptation for both birds and bats.

The Fi variable and the capacity for language

The probability of a rise to intelligence (defined as at least human-level cognitive ability) is thought to be one of the least accurately estimated variables in the Drake equation. Although in 1961 the estimate was an optimistic 1, key evolutionary theorists such as Ernst Mayr have pointed out that on earth alone, there have been trillions of chances for species to evolve to human intelligence. And how many times has it happened? And so, the current consensus for the Fi variable is still anywhere between basically 0 and 1.

Perhaps it’s possible to consider the concept of intelligence more carefully. For example, when did we become intelligent? What defines our intelligence as being greater than that of other animals? One commonly held belief is that the capacity for language is a unique core component of human intelligence. In his lecture on Life in the Universe, Stephen Hawking states:

“…with the human race, evolution reached a critical stage, comparable in importance with the development of DNA. This was the development of language, and particularly written language. It meant that information can be passed on, from generation to generation, other than genetically, through DNA. There has been no detectable change in human DNA, brought about by biological evolution, in the ten thousand years of recorded history. But the amount of knowledge handed on from generation to generation has grown enormously.”

Theories of the origin of language are as interesting as they are diverse. The main reason for this is that evidence is basically impossible to find; language consists of a cognitive capacity and its history has not left behind physical evidence before its written forms. You can’t ‘dust for language.’

The capacity to Merge

A recently developed approach to considering the origin of language involves the Strong Minimalist Thesis (SMT). Its main proponent is Noam Chomsky, who holds that language is a uniquely human capacity. Although animals share some aspects of language in their communication, it’s widely accepted that they do not possess its critical feature: a generative capacity defined by the ability to structure a finite number of linguistic elements into an infinite number of sentences. This ability is innate in humans and requires a hierarchical syntactic structure: the combining of linguistic elements according to a rule-based system. The cornerstone of SMT is a process called “Merge.” Put simply, Merge is the capacity to ‘merge’ multiple individual “concepts” in a manner that is computationally minimal and maximally efficient. For example, Merge allows us to combine {apples} and {the} into {the, apples} and then add that to {ate}, deriving {ate, {the, apples}}.

Central to Merge is the idea of unordered combinations of linguistic expressions. Having no rules for ordering these components is crucial because adding an order rule would greatly stifle efficiency and computational minimalism. How can there be language with no linear order to its structure? Here, Chomsky refers to I-language, where “I” refers to “internal, intensional and individual.” This is a different mode from the external mode in which we speak. Fascinatingly, this structuring of expressions is a way of managing thoughts even before they are internally articulated.
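
For the programming-minded, the {ate, {the, apples}} example can be mimicked with a toy Python sketch in which Merge is modelled as nothing more than forming an unordered two-member set. This is an analogy of my own, not anything taken from the SMT literature. The names merge and depth are hypothetical, and using Python's frozenset to stand in for a syntactic object is an assumption made purely for illustration.

```python
def merge(x, y):
    """Combine two objects into one unordered set {x, y}.

    frozenset is used because it is hashable, so a merged object can
    itself be a member of a later merge.
    """
    return frozenset({x, y})


def depth(obj):
    """Count how many levels of nesting a merged object contains."""
    if isinstance(obj, frozenset):
        return 1 + max(depth(member) for member in obj)
    return 0  # a bare word has no internal structure


the_apples = merge("the", "apples")        # {the, apples}
ate_the_apples = merge("ate", the_apples)  # {ate, {the, apples}}

if __name__ == "__main__":
    # Order of the arguments makes no difference to the result.
    print(merge("apples", "the") == the_apples)
    # The structure is hierarchical even though it is unordered.
    print(depth(the_apples), depth(ate_the_apples))
```

The point of the toy is that hierarchy and order are separable: the nested object records what was combined with what, without saying anything about which word is spoken first.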

Indeed, Chomsky believes the “Basic Principle” of language is the structuring of thought and not external communication. In a paper from earlier this year, Chomsky states:

“Note that these conclusions about language architecture undermine a conventional contemporary doctrine that language is primarily a system of communication, and presumably evolved from simpler communication systems. If, as the evidence strongly indicates, even externalization is an ancillary property of language, then specific uses of externalized language, as in communication, are an even more peripheral phenomenon – a conclusion also supported by other evidence, I think. Language appears to be primarily an instrument of thought, much in accord with the spirit of the tradition. There is no reason to suppose that it evolved as a system of communication.”

It becomes even more intriguing when one considers the origin of Merge. According to SMT, it arose as the result of a genetic change in a single individual. Yes, the idea is that a single person became endowed with the key cognitive architecture to propel humankind towards modern intelligence. It may sound outlandish but it corresponds well with events in history that show a wellspring of human development beginning around 100,000 years ago, evidenced by the first objects that demonstrate the existence of symbolic thought. What’s more, Merge, in accordance with the minimalistic motif of SMT, required only minor genetic changes that would have been possible in a relatively short period of human evolution.

What would this individual have been like? S/he was uniquely capable of thought, planning, inference, reflection and so on, according to Chomsky. [I contacted Dr. Chomsky to see if he could expand on this. He replied (within 15 mins), briefly saying that this individual had some semblance of thoughts that we can articulate, whereas others did not]. Critically, Merge could have given the individual adaptive advantages that were passed on.

The ‘Cosmilinguistic’ Connection

The SMT has stirred debate amongst prominent linguists and currently there exists no clear neurobiological evidence for Merge (although much progress has been made). However, mutation rates are becoming increasingly well understood. For example, in 2009, a landmark study involving two distant relatives in China revealed the first direct estimate of the human mutation rate. A more recent development came last year from a collaboration between Harvard and MIT groups, involving a new, potentially more accurate method of estimation. Although we are surely a long way from determining any candidate set of genes associated with SMT, determining the likelihood of a mutation resulting in such a set is within the realm of science – it is something measurable that can be accomplished by using methods that we currently have some grasp of. Importantly, this quantity could inform a much needed narrowing of the Fi variable in the Drake equation.

Does this suggest that aliens are speaking English? Certainly not, but even a vague estimate of the capacity for language helps to reveal the chances that they are able to articulate, store, record and possibly transmit information in an advanced manner. Transmitted information that either we or they can receive and interpret.



Taming the Beast: Keeping the new wave of Psychedelics research on the path to enlightenment



In recent years there have been countless headlines about new clinical research involving psychedelic drugs, trumpeting advances in a manner akin to: ‘New Study Shows Progress in LSD Research – Offers new potential for treatment of’ – a Google search of “Recent LSD research” will provide a gratifying demonstration. Indeed, three new review articles about psychedelic drugs place emphasis on milestone developments (1, 2, 3).

Some high-profile media aggrandizement of the ‘powerful potential of psychedelics’ may indeed help to propel future studies, but what about the enormous stigma that also surrounds psychedelic drugs? Can it simply be dismissed as “hysteria?” Is talk of mind control, bad trips, acid casualties and flashbacks all poppycock? Since its first semi-synthesis in 1938 by Albert Hofmann, LSD has always been thought of as having “powerful potential.” But without careful science, a new-wave stigma may be brought about. Indeed, science by definition takes phenomena, studies them and by revealing their nature makes them more predictable. The effects of psychedelic drugs are still defined as being largely unpredictable and potential danger remains lurking. Before going on to use them in a clinical setting and keeping our fingers crossed, a large dose of basic science could provide much needed insight. Insight that actualizes all this “potential” for therapeutic use.

Psychedelic drug research: At its worst

Interest in the potential of psychedelic drugs increased sharply after the oft-written about, accidental ingesting of LSD tartrate in 1943 that made for Albert Hofmann’s unforgettable “kaleidoscopic and fantastic” bicycle trip home from the Basel Sandoz laboratories in Switzerland. As outlined in a review from last month, it wasn’t long before LSD was semi-synthesized en masse for the production of Delysid – sugar-coated tablets for administration. It found itself largely in the hands of experimental psychiatrists and psychologists treating depression, alcoholism, schizophrenia and autism. These experiments were reported to provide “therapeutically valuable insights into unconscious processes.” Between 1953 and 1973 there were a reported 116 studies involving psychedelic drugs, all funded by the US government. There existed an enthusiasm that smacks of familiarity with the current zeitgeist.

Soon after Hofmann’s trip, in 1951, the Central Intelligence Agency began its MKUltra program, the birth of the “problem child” of psychedelic research – a term used by Albert Hofmann himself. Alongside research with more ethical motivations, the MKUltra program was officially concerned with mind control – behaviour modification and prisoner interrogation. Experiments that would be, by today’s standards, considered immoral and illegal were conducted at no fewer than 30 universities across North America. Historians commonly describe the project as having been aimed at creating a ‘Manchurian Candidate’: a captured enemy brainwashed into infiltrating their own government.

MKUltra was a top-secret cold-war era program targeted at obtaining high-profile information and exploring the possibility of mental reprogramming on anyone considered to be a potential threat. Often, the first ‘candidates’ were actually unwitting citizens forced into participation (for an early and eye-opening documentary see this). Under the guise of procuring protection of US citizens from communist forces, the CIA initially administered LSD to those considered to be on the fringes of society – prostitutes, prisoners, patients with mental disorders and drug addicts – as they were thought to be the most susceptible to influence. Alongside these experiments, LSD and other psychedelic drugs were also given to doctors, CIA employees and those considered to be among the general public. Equally wrongful, the atrocities that happened in the dark realm of MKUltra were conducted at universities by department authorities with no knowledge of the true purpose of the research. Indeed, some of the CIA funding was laundered through other sources directly connected with the institutions, while faculty PIs were completely in the dark as to the true motivation for the studies.

The research was considered cutting-edge at the time and, remarkably, some of it was undertaken in the belief that it had genuine potential for therapeutic benefit. The idea behind experimental therapies such as ‘psychic driving’ and, later, what was known as ‘depatterning therapy’ was to take the mind of a patient and essentially wipe it clean through repeated sessions of LSD administration, electric shocks and sleep deprivation. The aim was to make patients forget the source of their illness and turn their mind into a blank slate from which they could begin anew. They would then begin the ‘repatterning’ component, which required them to listen to taped ‘therapeutic’ messages for hours, days, weeks and months on end. Here’s one study involving depatterning, conducted by Dr. Cameron on patients with schizophrenia. Evidently, these treatments seemed very promising. So why was the funding for LSD and other psychedelic experiments cut? Officially, it was because the effects of these drugs were found to be unpredictable, as stated by JS Earman, the CIA’s Inspector General at the time.

The effect of this illegal and immoral program of unwitting psychedelic drug administration was an incalculable PR disaster for science and an impetus for the backlash against LSD research that occurred in the late 60s and early 70s, making psychedelics research itself a true ‘acid casualty.’ The case of Velma Orlikow illustrates the typical consequences of this loosely constrained, paranoia-motivated program. She checked into the Allan Memorial Institute at McGill University in Montreal because of postnatal depression. Her long-term treatment under Dr. Donald Ewen Cameron included being administered LSD 14 times and then having to listen exhaustively to repeated, taped conversations intended to be therapeutic. This torture was part of a developing technique called “psychic driving.” Orlikow did not recover. Rather, she spent the rest of her life suffering mental illness attributed to the therapy program. She was one of nine victims who in 1988 were given a portion of $750,000 CAD by the Canadian Justice Department in an out-of-court settlement for the consequences of this type of manipulation.

Meanwhile, recreational LSD use was on the rise, becoming an important catalyst for, and central part of, the 60s counterculture movement. Baby boomers coming of age were exalting Timothy Leary to an almost religious status. His Harvard academic cred symbolized the go-signal for “turning on, tuning in and dropping out”. Millions of Americans are reported to have consumed LSD. But where did this lead them? What enlightenment was achieved? How would society change? Well, it seemed the newly elected Nixon government wasn’t exactly the representation that the counterculture youth were hoping for, declaring “the war on drugs”. LSD became a Schedule 1 drug and the counterculture movement was left with only flashbacks to better times. Research on psychedelics as therapeutic agents on American and Canadian campuses came to a halt, and the stigma went from being a problem child to a full-fledged demon. Indeed, the sun had set on the ‘Age of Aquarius.’

The new realm of psychedelics research

The dust raised by what happened in the first twenty years of psychonautic exploration of hallucinogenic drugs appears to have settled. Research is now completely untethered from Cold War motivations, and there is no ultra-secret program funding research to determine how to reach into the very core of the minds of individuals and assert complete control (interestingly, some labs have successfully turned to crowdfunding for financial support). Meta-research has identified important methodological shortcomings of the pioneering work, including the lack of basic control groups and follow-up measurements, substantial variation in dose and dosing, non-standardized criteria for therapeutic outcome and the use of fringe methods for determining outcomes of drug use, such as sensory isolation and sensory overload (Santos et al., 2016). Today’s studies with psychedelics are significantly more controlled and thereby safer. Participants remain under careful supervision in safe environments while undergoing the psychedelic experience. Research dedicated to a genuine exploration of psychedelics for therapeutic purposes and for a basic understanding of brain function is currently underway and increasing. Organizations dedicated to this research now exist, including the Multidisciplinary Association for Psychedelic Studies (MAPS), the Beckley Foundation and the Heffter Research Institute. Similarly, university labs primarily focused on psychedelic research are also becoming more numerous.

But despite having over 50 years of research behind us, is there enough control and predictability tied to the use of these drugs? Can we be sure that in another 50 years we won’t look back on the current clinical trials, to some degree, in the way we now look back on the early research? If that were the case, perhaps the current research would proceed without much attention. Instead, media coverage has reached respected outlets including Nature News, the New Yorker, the New York Times and Nautilus. Moreover, LSD, psilocybin and DMT are still Schedule 1 drugs, and this fact still deserves careful consideration.

A review published last month by Santos et al. covers much of the ground on which the last 25 years of work stands. It emphasizes six recent clinical trials involving LSD, psilocybin and DMT. These six studies were singled out because they met the authors’ criteria for being the most methodologically sound. Eligibility criteria included peer-reviewed publication, structured diagnostics of anxiety, depressive or dependence disorders, placebo use, and validated scales for measuring changes in symptomatology. Each study showed a reduction of symptoms associated with anxiety, depressive and dependence disorders. At a neurophysiological level, possible mechanisms of action are discussed for each one.

The work sounds promising and shows much potential. This is cutting-edge research on the frontier of human knowledge regarding the brain and the mind – just as it has been since the late forties. Framed this way, it might be easier to understand the importance of keeping this research carefully controlled and of not dismissing what happened in the first 25 years as hysteria, despite the change in political climate.

So what’s ‘the answer’? Part of it is likely to lie in improving the predictability of the effects of these drugs, especially LSD. In two of the previously mentioned reviews advocating the pharmacological use of psychedelics, hallucinogens are still described as largely “unpredictable” and “dependent on current psychological state and social environment.” This is the same conclusion arrived at by the CIA in the 1960s. Moreover, it’s a token anti-hallucinogen warning given on any standard drug abuse website, educational film, pamphlet or help line. Bad trips and flashbacks are simply not hysteria. Robin Carhart-Harris, head of psychedelic research at Imperial College London, acknowledges “…LSD has potential negative effects. Probably the crucial one is a bad trip. It’s not uncommon for people to have anxiety during a psychedelic drug experience…the experience can be nightmarish at times.” As for flashbacks, the phenomenon is now officially termed hallucinogen persisting perception disorder and has been studied extensively.

The fact that unpredictability remains the problem child of these drugs underscores the need for more in-depth knowledge of their mechanism of action and long-term effects. It’s not as though there is no current insight into either of these; much progress has been made. For example, many studies have shown that LSD acts as an agonist at 5-HT (serotonin) receptors and how this might promote some of the benefits observed. Carhart-Harris et al. published the first neuroimaging study of the LSD experience, revealing its neural correlates – a landmark study that garnered much media attention. This study revealed vast changes in brain connectivity under the influence of LSD and has important implications for the neurobiological underpinnings of the LSD experience. In the past year alone his prolific lab has published no fewer than seven studies on the relationship between the experience of the drug and its effect on the brain. The most recent study used this approach to investigate long-term changes in personality after LSD use. These studies, which investigate the more fundamental dynamics of psychedelic drugs, go a long way towards reducing their unpredictability and thereby give therapeutic research a much stronger foundation.

In conclusion, the new age of research is here and it’s growing, but perhaps a better understanding is needed on a more fundamental level before administering these drugs to patients with disorders. The risk of bad trips and flashbacks is still acknowledged, and the “unpredictability” factor still lurks like the ghosts of Timothy Leary and the MKUltra research program. Headway is being made in reducing it, but perhaps that could be a primary focus. Otherwise, as over the past 50 years, the research may only continue to proclaim its ‘potential’. Hopefully, in a time soon to come, we can see its use as standard.


First Flight

Welcome to Spirit // Matter! Here, you’ll find interesting content related to science, with a focus mainly on cognitive neuroscience and psychology. News of new studies will be dispatched, and views and issues related to recent scientific frontiers will be presented in a thought-provoking manner. So get that hand on the chin and those eyes slightly squinted!