HAMLET
Human, Animal, and Machine Learning: Experiment and Theory
Abstract: Much research in cognitive neuroscience attempts to localize cognitive functions to specific neuroanatomical regions. A common method for identifying such regions is to model the average neural responses evoked across participants by tasks that tap different cognitive processes. Such methods typically assume that neighboring voxels encode similar information, that anatomically distal regions support different functions, and that functional specialization is anatomically stable across subjects. When viewed through this lens the semantic system can appear modular, with different regions representing different conceptual domains. Computational models of semantic memory, however, propose that representations are highly distributed: the very same neurons encode information about different conceptual domains, neighboring voxels potentially encode quite different information, and distal regions can contribute to the same representation. Standard neuroimaging methods adopt assumptions that will destroy a distributed signal, and so cannot be used to test this hypothesis. We adopted a different approach in an analysis of event-related fMRI data collected while participants made semantic judgments for each of 900 words. Voxels that carried information about animals or manmade objects were identified by training classifiers with L1-regularized logistic regression. When constrained to be very sparse, the classifiers showed above-chance generalization to novel words. The voxels selected by each classifier overlapped more than would be expected by chance, suggesting that some voxels contribute to both animal and artifact representations. Compared to standard univariate analysis, the sparse regression analysis revealed a very different picture of the neural systems that support word-meaning. 
Whereas the former identified left-lateralized regions in the medial temporal lobe, occipital cortex, and parietal cortex that responded selectively to animal or to manmade object words, the latter method revealed a bilateral network in antero-ventral temporal and frontal regions, with most voxels contributing to the representation of both conceptual domains.
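To make the sparse-classifier idea concrete, here is a minimal sketch of L1-regularized logistic regression fitted by proximal gradient descent. The toy data, the function names, the penalty strength, and the choice of optimizer are all illustrative assumptions and are not taken from the study. Columns stand in for voxels: column 0 is fully informative, column 1 is a weaker, redundant copy, and column 2 is uninformative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero, clip small values to exactly 0.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_l1_logistic(X, y, lam, lr=0.5, steps=500):
    """Minimize mean logistic loss + lam * sum(|w|); the bias b is not penalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        residual = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                    for row, yi in zip(X, y)]
        grad_w = [sum(r * row[j] for r, row in zip(residual, X)) / n for j in range(d)]
        grad_b = sum(residual) / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: 8 "trials" x 3 "voxels"; label 1 = animal word, 0 = artifact word.
toy_X = [
    [ 1,  1,  1],
    [ 1,  1,  1],
    [ 1,  1, -1],
    [ 1, -1, -1],
    [-1,  1,  1],
    [-1, -1,  1],
    [-1, -1, -1],
    [-1, -1, -1],
]
toy_y = [1, 1, 1, 1, 0, 0, 0, 0]

sparse_w, _ = fit_l1_logistic(toy_X, toy_y, lam=0.3)
dense_w, _ = fit_l1_logistic(toy_X, toy_y, lam=0.0)
selected = [j for j, wj in enumerate(sparse_w) if wj != 0.0]
print("selected voxels:", selected)
```

With the penalty switched on, only the most informative column keeps a nonzero weight. Without it, the redundant column is used as well. In a real analysis the selected voxel sets from two classifiers could then be compared for overlap.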
Abstract: Many studies have demonstrated that children younger than 30 months of age - unlike their slightly older counterparts - learn better from in-person demonstrations than they do from comparable video demonstrations, a phenomenon known as the video deficit. After a decade or more of research documenting that the video deficit exists, our lab and others are trying to better understand why it exists. Hypotheses include differences in online processing, dual representation, memory retrieval and transfer, inhibitory control of behavioral responses, need for socially contingent stimuli, and others. I will present new data from a number of studies that address some of these hypotheses. These studies explore young children's attention to video and real-life events (as measured by eye movements) as well as toddlers' learning (e.g., word-learning, object-retrieval) from in-person displays, traditional/non-interactive video, and "interactive" video that is contingent on children's responses.
Abstract: Although the ability to approximate the number of objects in a set without counting is present even in young infants and non-human animals, the acquisition of exact number concepts such as precisely 5 or 9 depends on education. However, the mechanisms that transform children’s pre-verbal approximate number representation into exact symbolic number representations are poorly understood. A recent computational model has suggested that the same neurons that initially represent approximate non-symbolic quantities also come to represent Arabic numerals precisely (Verguts & Fias, 2004), and fMRI studies with adults have provided indirect support for this hypothesis (Piazza et al., 2007). In this talk, I will present behavioral and fMRI data on the developmental mechanisms that transform approximate non-symbolic representations into exact symbolic ones, and the important role that educational experiences may play in this transformation. As part of a larger Educational Neuroscience investigation of children’s number concepts, we scanned 63 children in grades K-3 to assess neural responses to arrays of dots and symbolic numbers. After adaptation to a stream of either 6 or 8 dots, we measured responses to deviant digits (5 or 9) that were either semantically similar to the adapting quantity (5 vs. 6 and 9 vs. 8) or far from the adapting quantity (9 vs. 6 and 5 vs. 8). Average fMRI responses to digits in parietal cortex increased with increasing age, education, and math ability. Additionally, our data suggested that changes in fMRI responses were primarily driven by increasing novelty responses to the semantically similar numbers, indicating increasing precision in these parietal systems. However, age and education were highly correlated in this study. I will conclude with a description of a new study we are embarking on to disentangle these issues.
Abstract: People's knowledge of language includes information about its statistical properties. Similarly, a speaker's attitudes and opinions about a person or topic are implicit in the lexical statistics of what they say. In this talk, we will present a sequence of analyses of television news broadcasts, in which we measure how frequently and how positively high-profile politicians, groups, events, and topics are mentioned. These analyses reveal a number of findings: (1) there is a strong relationship between words' frequency and words' positivity, deriving in part from the higher frequency of positive words in the English language; (2) these frequency and positivity relationships can be used to compare different television networks with regard to the positivity of their coverage; (3) frequency and positivity can be compared to various external data points, such as real-world events and approval ratings.
Abstract: The prefrontal cortex (PFC) plays a central role in a diverse set of cognitive and behavioral processes, including sustained attention, working memory, and behavioral inhibition. In delayed-response tasks that probe working memory and other PFC functions, fluctuations in the spiking rates of PFC neurons are posited to reflect the maintenance of attentional processes, abstract rules, or past stimuli and events. However, very little is known about the coding mechanisms that maintain this information for long periods of time across the delay period. Additionally, stress impairs higher cognitive processes dependent on the PFC. Surprisingly, the actions of stress on PFC neuronal discharge in animals engaged in tasks of working memory also remain unknown to date.
In this talk, I will present neurophysiologic data recorded from the PFC of rats performing a delayed-response task of spatial working memory that refute the predominant theories related to stress-induced cognitive impairment. In addition, I will present a model of the conditional intensity of neuronal spiking within the PFC that allows these neurons to maintain information for extended periods of time: spiking-history-predicted discharge (SHPD). This model is framed within the context of a generalized linear model to address several questions related to PFC neuron function and the actions of stress. First, to what degree does past spiking activity of PFC neurons predict or modulate ongoing activity of these neurons? Second, does stress have an overall impact on the predictability of PFC neuron discharge given a cell’s intrinsic spiking history? Third, do specific task components (i.e., delay-period vs. behavioral response) interact with or modulate SHPD during baseline and acute stress?
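As a rough illustration of how spiking history can enter a generalized linear model, the sketch below computes a conditional intensity from a baseline term plus weighted past spike counts, and scores a spike train with a Poisson log-likelihood. The log link, the binning, the number of lags, and all names and parameter values are assumptions chosen for illustration. The abstract does not specify them, and this is not presented as the SHPD model itself.

```python
import math

def conditional_intensity(spikes, baseline, history_weights):
    """Expected spike count per bin: exp(baseline + sum_k h[k] * spikes[t-1-k])."""
    rates = []
    for t in range(len(spikes)):
        drive = baseline
        for k, weight in enumerate(history_weights):
            lag = t - 1 - k          # k = 0 is the immediately preceding bin
            if lag >= 0:
                drive += weight * spikes[lag]
        rates.append(math.exp(drive))
    return rates

def poisson_log_likelihood(spikes, rates):
    """Log-likelihood of binned spike counts under independent Poisson bins."""
    return sum(n * math.log(r) - r - math.lgamma(n + 1)
               for n, r in zip(spikes, rates))

# Hypothetical binned spike train and parameters (not from the recordings).
spikes = [0, 1, 0, 0, 1, 1, 0]
baseline = math.log(0.2)
history = [-2.0, 0.5]   # assumed: suppression at lag 1, mild facilitation at lag 2

rates = conditional_intensity(spikes, baseline, history)
flat_rates = conditional_intensity(spikes, baseline, [0.0, 0.0])
print(poisson_log_likelihood(spikes, rates))
print(poisson_log_likelihood(spikes, flat_rates))
```

Comparing the likelihood with and without history terms is one simple way to ask how much past spiking predicts ongoing activity. Task or stress covariates could be added to the same linear predictor.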
Abstract: Human list production is the process of a person spontaneously generating an ordered list of items. Such lists have important applications in cognitive psychology and machine learning. I present a new sampling-with-discounted-replacement model for human list production.
Its relation to classic sampling with/without replacement will be made clear with several urn-ball models. I give two real-world applications of this new model: (Psychology) in verbal fluency, our estimated parameters align well with psychological factors thought to influence behaviors in healthy and brain-damaged humans; (Machine learning) in document categorization, our model improves the accuracy of text classifiers trained from feature-label pairs spontaneously produced by human teachers.
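One way to picture the link to urn models is the sketch below. It assumes a particular discounting rule that the abstract does not spell out: each item starts with a weight, items are drawn in proportion to their current weights, and a drawn item's weight is multiplied by a factor `alpha` before the next draw. Under this assumption, `alpha = 1` behaves like sampling with replacement and `alpha = 0` like sampling without replacement. The function name and the example weights are hypothetical.

```python
import random

def produce_list(weights, alpha, length, rng):
    """Draw up to `length` items; discount a drawn item's weight by `alpha` (assumed rule)."""
    current = dict(weights)          # copy so the caller's urn is left untouched
    produced = []
    for _ in range(length):
        if sum(current.values()) <= 0:
            break                    # nothing left to draw (only possible when alpha = 0)
        items = list(current)
        chosen = rng.choices(items, weights=[current[i] for i in items])[0]
        produced.append(chosen)
        current[chosen] *= alpha     # alpha = 1: unchanged; alpha = 0: removed from the urn
    return produced

urn = {"dog": 3.0, "cat": 2.0, "emu": 1.0}   # hypothetical initial weights
without_replacement = produce_list(urn, 0.0, 5, random.Random(0))
with_replacement = produce_list(urn, 1.0, 5, random.Random(0))
discounted = produce_list(urn, 0.3, 5, random.Random(0))
print(without_replacement, with_replacement, discounted)
```

Intermediate values of `alpha` allow repeats but make them less likely, which is the behavior a model of spontaneous human lists needs to capture.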
Abstract: People learn about the environment both from direct observation and from communicating with others. One may walk outside and see, hear, and feel the rain, or one can be told “it’s raining.” Similarly, one can learn that there is a dog behind a fence by hearing its bark, or by being told that “there is a dog there.” Both types of information—verbal (words) and nonverbal (“signals”)—can be highly reliable, but I will argue that there are several key differences that help explain why words such as “dog” seem to evoke concepts more effectively than sensory signals such as a barking sound. One such difference concerns the ways in which words and signals vary (and do not vary) with their referents. I will present a modeling framework and lots of new behavioral data speaking to unique design features of words and relate it to basic questions concerning the role of language in human categorization.
Abstract: Due to the practical limitations of data collection from human-subject studies such as the cost or duration of the experiment, the patience or attention span of an individual subject, and the desire to obtain a base level of statistical significance, the scale and scope of experiments are often very restricted. However, it is often found after the fact that there is structure underlying the data. For instance, Gary in last week's HAMLET pointed out that given 4 arbitrary non-English words, one is routinely chosen to be the most likely synonym of "pin." This implies that English words are embedded in a structured space and that each word is not a completely arbitrary symbol. In this talk I will discuss algorithms that exploit this kind of structure so that in the experimental setting, the same conclusions can be made with the same statistical guarantees using drastically fewer questions to human subjects. I will also discuss an iPad application I developed that implements some of these ideas: it exploits the structure of the space of beers to learn a subject's preferences over all styles of beer using a very small number of questions.
Abstract: Does the sustained, elevated neural activity observed during working memory (WM) tasks reflect the short-term retention of information? The sensitivity of this activity to variation of memory load, typically observed in the intraparietal sulcus (IPS) and dorsolateral prefrontal cortex (dlPFC), is often taken as evidence that these areas play a critical role in retention. However, using multivariate pattern classification (MVPC), we and others have shown that stable stimulus-specific information can be recovered from posterior visual areas, but not parietal or frontal regions, during the delay period of a working memory task. In two recent studies we investigated this discrepancy by looking directly at what information is carried by load-sensitive delay-period activity using fMRI. Subjects were scanned while performing a delayed-recognition task for the direction of 1, 2, or 3 patches of coherently moving random dot stimuli. Load-sensitive regions were identified in both the IPS and dlPFC using a univariate GLM. MVPC failed to show that these regions represent the direction of motion that is being remembered. For extrastriate cortex, in contrast, which did not show load-sensitive delay-period activity, MVPC indicated delay-period retention of the direction of motion. These results support the view that frontal and parietal cortex represent high-order information about WM tasks, but are not directly involved in the storage, per se, of information. Closer inspection of the weights underlying the classification performance, however, calls these results into question. Seemingly paradoxically, important voxels for whole-brain decoding of stimulus identity were found in parietal and frontal regions, and appear to wax and wane in importance over the delay.
Abstract: The unbiased estimation of causal treatment effects from observational data requires a statistical analysis that conditions on all confounding covariates. All the confounding (i.e., selection bias) can only be removed if the selection mechanism is ignorable, that is, if all confounders of treatment selection and potential outcomes are available and reliably measured. Ideally, covariates are selected according to well-grounded substantive theories about the selection process and the outcome-generating model. However, with weak or no theories about these two matters, covariate selection strategies become more heuristic. In my talk I briefly introduce the “Rubin Causal Model”, discuss classes of bias-inducing and bias-amplifying covariates, and outline different strategies for selecting covariates in practice.
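A small worked example may help show why conditioning on a confounder matters. The sketch below compares a naive difference in group means with a stratified estimate that conditions on a single binary covariate. The data are invented so that the true effect is 1.0 in every stratum. The function names are hypothetical, and the example assumes that this one covariate is the only confounder and that both treatment groups appear in every stratum.

```python
from collections import defaultdict

def mean(values):
    return sum(values) / len(values)

def naive_effect(rows):
    """Difference in mean outcome, treated minus control, ignoring covariates."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

def stratified_effect(rows):
    """Average of within-stratum differences, weighted by stratum size."""
    strata = defaultdict(list)
    for x, t, y in rows:
        strata[x].append((t, y))
    effect = 0.0
    for members in strata.values():
        treated = [y for t, y in members if t == 1]
        control = [y for t, y in members if t == 0]   # assumes both groups are non-empty
        effect += (len(members) / len(rows)) * (mean(treated) - mean(control))
    return effect

# Invented rows of (covariate x, treatment t, outcome y).
# Units with x = 1 are more often treated and also have higher outcomes.
rows = ([(0, 0, 1.0)] * 8 + [(0, 1, 2.0)] * 2 +
        [(1, 0, 5.0)] * 2 + [(1, 1, 6.0)] * 8)

print(naive_effect(rows))       # inflated by selection on x
print(stratified_effect(rows))  # recovers the within-stratum effect
```

Here the naive comparison mixes the treatment effect with the effect of the covariate on selection and outcome, while the stratified estimate removes that bias. With unmeasured confounders, neither estimate would be trustworthy.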
HAMLET mailing list
Tim Rogers (ttrogers@wisc.ed), David Devilbiss (ddevilbiss@wisc.ed), Chuck Kalish (cwkalish@wisc.ed), and Jerry Zhu (jerryzhu@cs.wisc.ed) (Add 'u' to the addresses)