Just sent this email to the currently enrolled students in my upcoming PSYCH 315 course this summer. Posting here in case others are thinking of taking it. Want to be clear about the pace.
Good morning this Fri. of Finals week,
I hope you are each done with your spring quarter, that it went reasonably well, and that you enjoy the slice of time before our B term 315 class starts. This is not to say that our 315 class won’t be enjoyable. It can be, especially if you like stats, but it is an atypical sort of enjoyment: immersive, all-encompassing, with no time for anything else. Like virtual reality, only not virtual and all stats. More on that below.
The first thing to say is that we are using a very nice textbook. It is in its 7th edition. You can purchase the new edition, but you can also use an older edition. There should be lots of 6th editions floating around. The authors are King, Rosopa, and Minium.
Now back to the immersive nature of our four-week 315 class. If you already know all this material well, then what I say here doesn’t apply to you. But if you are learning stats for the first time and you don’t want to take this class again, try to clear your schedule during our class. I understand you may have work or other responsibilities you cannot simply pause. Try to pause anything optional. I don’t recommend taking other intensive courses during these four weeks. I would also get ready to have a diminished to non-existent social life during these four weeks. The pace is intense, and if you get behind, catching up is difficult.
There will be support to help you learn this material. We have a wonderful graduate TA, and I believe we’ll have a handful of undergrad peer tutors as well. Thus, assistance outside of class should be available as needed.
Take care! Enjoy June and see you toward the end of July,
Here’s a description of a new course I’m teaching next quarter (Spring, 2019).
PSYCH 548 ADV QUANT PSYCH: Exploratory Data Analysis in Psychology
From very early on, Psychology as a discipline has emphasized hypothesis-driven research. At the same time, for decades, exploratory statistical approaches and algorithms, such as exploratory factor analysis and principal components analysis, have been key analytical approaches. Additionally, machine learning and other exploratory algorithms are being embraced across many scientific domains, including Psychology. What about this disconnect between a historical disdain for exploratory analysis and the current interest in complex exploratory computational procedures? How have we thought, and how could we think, about exploratory data analysis in Psychology? How should exploratory work be used to extend Psychological knowledge and theory?
In this seminar, I hope we will consider these issues, as well as some exemplar models and approaches. We’ll read historical and more recent papers on exploratory analysis generally, as well as focus on some specific models (some possibilities include exploratory factor analysis, canonical correlation, cluster analysis, some machine learning approaches, exploratory SEM; suggestions welcome). Classes will focus on discussion of the material and its implications for research (broadly, as well as specific to a single research area). You will be encouraged to work with your own data for the class, and we’ll strive to work some of those analyses into class meetings. I’m considering brief weekly reaction papers or brief analyses using some focal model on your own data. A final paper will compare the scientific upshot of a few exploratory approaches applied to your data.
My goal is that we come out of this seminar thinking more broadly and critically about exploratory analysis in Psychology.
In my graduate class on path analysis, we do a lot of analysis on our own data. This year, I suggested that people consider analyzing simulated data based upon the statistics of their data. This way they’ll use a data set that looks like their data, but they aren’t doing a lot of model fitting on data they care about and want to use in real research. Thus, today I typed up a quick guide to simulating multivariate normal data in R for use in our class.
If you find typos, errors, etc., please let me know.
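As a rough illustration of the idea described above (a minimal sketch, not the guide itself), one way to simulate multivariate normal data from summary statistics is `mvrnorm()` from the MASS package. The variable names, means, standard deviations, correlations, sample size, and seed below are all made-up placeholders; you would substitute the statistics from your own data.

```r
library(MASS)  # provides mvrnorm()

set.seed(548)  # arbitrary seed so the example is reproducible

# Hypothetical summary statistics standing in for your own data
vars  <- c("x1", "x2", "y")
means <- c(x1 = 10, x2 = 5, y = 20)
sds   <- c(x1 = 2, x2 = 1.5, y = 4)
cors  <- matrix(c(1.0, 0.3, 0.5,
                  0.3, 1.0, 0.4,
                  0.5, 0.4, 1.0),
                nrow = 3, dimnames = list(vars, vars))

# Convert the correlations and SDs into a covariance matrix
Sigma <- diag(sds) %*% cors %*% diag(sds)
dimnames(Sigma) <- list(vars, vars)

# Draw n observations from a multivariate normal with these moments
n   <- 200
sim <- as.data.frame(mvrnorm(n = n, mu = means, Sigma = Sigma))

# Compare the simulated statistics with the targets
round(colMeans(sim), 2)
round(cov(sim), 2)
```

With the default settings, the simulated means and covariances will differ from the targets by sampling error. Passing `empirical = TRUE` to `mvrnorm()` makes the sample means and covariance matrix match the targets exactly, which may or may not be what you want for practice model fitting.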
I’ll be teaching Confirmatory Factor Analysis and Structural Equation Modeling next fall (listed as PSYCH 548 for 5.0 credits).
First, if you don’t know, I encourage you to bring your own data to use in the class. You have to be able to share some form of it with me, like a covariance matrix. I won’t share, distribute, or use it for my own work. You’ll submit your R syntax and the data, so I can help debug and provide feedback. Second, I’m planning to stick with R, although we may look at some other packages besides lavaan. In the past, I’ve let people use other programs, like Mplus, but I don’t think I’m going to do that anymore. Third, I’m planning to reinstate writing three research papers (one for each major topic: observed variable path analysis, confirmatory factor analysis, and latent variable path analysis), although with a peer review component. I’m also thinking about adding some work on simulation and power analysis. In the past, people have turned the class’s papers into thesis chapters or publications, so if you plan for it, you may be able to do the same thing.
For the class to be useful to you, you’ll want the following:
- hypotheses (or the ability to create such) about how your data may be structured and tested; this is not an exploratory data analysis class.
- if you have many more observations than variables, you need a “large” sample (probably greater than 100; over 300 is better). In some cases as few as 80 people will work, but the class can be more challenging/frustrating and/or the models quite limited.
- if you have many more variables than observations (e.g., time series, physiological, and/or neuroscience data), you’ll need to think about intra-individual covariance structure and pooling (or not) across people.
- either way, you want to think about “redundant measures”: items or measurements that are “getting at the same thing” and can be structured.
- however, if you have tiny cross-sectional or two time point data sets (say, N=20) with few variables on each respondent, the latent variables part of this class probably won’t work for you.