Episodes

  • Access the full transcript for this episode

    “I never thought I would find that sense of community here, especially as a transfer, because I've heard so much about the stereotypes…I think a club really helped combat that” —Avani Gireesha

    In our final episode of Season 9, we hear from three graduating UC Berkeley seniors, all of whom transferred from California community colleges into the Data Science major: Avani Gireesha, Hannah Brown, and Jake Pastoria. They reflect on their transitions from community college to Berkeley, discussing the clubs, research, and experiences they’ve gained in their two years here. Listen in as they offer advice for incoming transfer students on how to prepare academically, find community, and get the most out of their Berkeley experience!

    “I'm still not really used to the exam rigor here and how difficult it is, but that's totally okay. I feel challenged here, and it really pushes me to get out of my comfort zone and be a better student” —Hannah Brown

    “I think these classes change the way that I view education as a whole…I'll never forget opening up my first Data 8 Jupyter notebook and submitting it. Education here is really cool, and I think you should take all these classes, especially when the professors are absolute legends in Berkeley and just computer science and data science in general” —Jake Pastoria



    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit datascienceeducation.substack.com
  • Access the full transcript for this episode

    “There's a famous quote by a statistician, John Tukey, who's often associated with sort of introducing and promoting the concept of exploratory data analysis. And his quote is that the best thing about being a statistician is that you get to play in everyone's backyard, by which he means, as a data scientist, you get to dabble in all of these different areas…the longer you work in statistics, data science and adjacent fields, you really start to see that all these stories around data that come up in different disciplines, they're actually linked through the language of statistics and mathematics. So when I start a new domain, I will usually try to start by reasoning by analogy” —Prof. Alex Franks

    In this week’s episode, we talk with Professors Mike Ludkovski and Alex Franks from UC Santa Barbara about their diverse research backgrounds, ranging from stochastic modeling to sports analytics, and how those backgrounds shaped their approach to data science education. Mike and Alex discuss the value of co-teaching, designing interdisciplinary curricula, and helping students connect theory to real-world practice. They also touch on major initiatives aimed at expanding access to data science education, including the Southern California Consortium and the Pacific Alliance for Low-Income Inclusion.

    “We found out… the awareness of data science is vastly different across campuses within just a few miles of each other… we are trying to help different places stand up data science courses, programs, and share best practices. We organize events like datathons for high school and community college students” —Prof. Mike Ludkovski



  • Access the full transcript for this episode

    “What's new in the course this time is the question of what we do with artificial intelligence tools in this context. How should they be used? I'm not going to pretend they don't exist. I think it's absurd these days to imagine that students won't use them. Prohibiting those tools is, I think, futile. So my question is: how do I create an environment for students in which they know their privacy is being respected, and in which they have access to tools they can potentially run on their own computers?”

    In our second Spanish-language episode of the podcast, Eric Van Dusen and special guest host Edwin Vargas Navarro sit down with Fernando Pérez, who is the Faculty Director of the Berkeley Institute for Data Science (BIDS) at UC Berkeley, a Professor of Statistics, and co-founder of Project Jupyter and IPython. Fernando reflects on his path from physics to computational science, as well as the role of open-source tools and interactive computing in the development of Jupyter Notebooks. We touch on the evolution of Jupyter and how it furthers interdisciplinary and reproducible collaboration, and discuss Fernando’s teaching philosophy through courses like STAT 159, a course that emphasized reproducibility and collaborative computing. He speaks on the challenges of AI integration in education, and offers broader advice to fellow data science educators on how to approach this quickly evolving landscape.

    En nuestro segundo episodio en español del podcast, Eric Van Dusen y el invitado especial Edwin Vargas Navarro conversan con Fernando Pérez, quien es el Director de Facultad del Berkeley Institute for Data Science (BIDS) en UC Berkeley, profesor de Estadística y cofundador de Project Jupyter e IPython. Fernando reflexiona sobre su camino desde la física hasta la ciencia computacional, así como el papel de las herramientas de código abierto y la computación interactiva en el desarrollo de los Jupyter Notebooks. Tocamos la evolución de Jupyter y cómo promueve la colaboración interdisciplinaria y reproducible, y discutimos la filosofía de enseñanza de Fernando a través de cursos como STAT 159, un curso que enfatizaba la reproducibilidad y la computación colaborativa. Él habla sobre los desafíos de la integración de IA en la educación, y ofrece consejos más amplios a los educadores de ciencia de datos sobre cómo abordar este panorama en rápida evolución.

    “Because while the mathematics may be the same, the value of data science is that it isn't purely probability, statistics, or linear algebra. The data come from a concrete place, from a community, from a group of people; they reflect aspects of that local context, and the decisions made based on those data will affect a local community.”



  • Access the full transcript for this episode

    “I think a lot of times, we focus on data science as a tech thing, right? Oh, you're going to go work for Meta. You're going to go work for Google. You're going to go work for insert tech company here or AI startup here. And for a lot of students, especially a lot of my students, they really want to contribute to their communities and give back, right? They're thinking about how to make their community stronger. And when we only focus on the tech approach, that's very sort of up here, over there, you know, they know they'll make good money. And so they might pursue that, but they don't realize that data science can be used for a lot of good as well. You can use it in ways that actually serve the community, serve the world, from helping develop algorithms that can read MRIs or other medical imaging data, to help diagnose some sort of disease or cancer, or to identify human rights violations by being able to search massive amounts of documentation.”

    Today, we sit down with Judith Canner, a professor of statistics at California State University, Monterey Bay. Judith begins by reflecting on her role in redesigning first-year mathematics and statistics courses in response to CSU executive orders that eliminated traditional remedial mathematics classes. She explains how co-requisite courses and active learning strategies help students succeed, and touches on the importance of quantitative reasoning across a variety of disciplines. She talks about the effectiveness of pair programming in her teaching, and urges listeners to reframe data science as a tool for social impact rather than just a path to a high-paying tech job. Judith closes by reminding fellow data science educators that data science is constantly evolving, so they shouldn’t be afraid to embrace change and collaboration.

    “Don't be afraid to take a chance. The reality is that data science is still a little undefined and still constantly changing. And working in the Cal State system, I'm often confined by the system itself, right? We have to work within multiple systems when it comes to curriculum, but I'm seeing more and more educators really taking risks and more and more folks really thinking about, can we do this a completely different way than we've always done it? And so, not being afraid to take those risks. Can we teach math in a way completely different than we've always done it? Being OK with letting go of the status quo…”



  • Access the full transcript for this episode

    “It was in the 1970s that David Freedman and his colleagues completely changed the way statistics is taught in the world, from going from just an emphasis on calculation, calculation, calculation, without really paying any attention to, what's the question, and what can you do with the answer?… Why does anyone care? What is the calculation that you can justifiably do, given the information at hand? And then how do you interpret the answer? That is traditional statistics teaching, and I haven't strayed one step away from it. I'm still there. It's called data science now. The tools are different. And because the tools are different, we are empowered to ask questions that we wouldn't have dared to ask before. And we can answer it in ways that we couldn't before. But I still think I am teaching traditional statistics.”

    Today, we sit down with Ani Adhikari, a pioneer in building data science at UC Berkeley. She explains that traditional statistics education at Berkeley has always emphasized conceptual understanding, an emphasis she aims to carry into the data science curriculum. Discussing teaching methods, she reassures statistics educators transitioning into data science that they don’t need to fundamentally change their approach, only the tools they use. Looking toward the future, Ani notes AI’s rapid development and stresses the importance of equipping students with fundamental reasoning skills that will remain relevant however the industry changes. She ends by urging fellow educators to respect the history of data science, build on it, and remain aligned with their own intellectual and philosophical teaching goals.

    “Think about why: why are you wanting to be a data science educator? That answer will be very different for many people. But trying to get to the core of that answer is the key. What is your intellectual, philosophical reason? And then make sure that everything you do, you always ask yourself: am I achieving those philosophical intellectual goals that I had? And please, please, please respect the history. Do not think of data science education as something brand new. It has been happening since people started making decisions… Know the history, respect the history, and build on it. And then you will be fulfilled, and so will your students be.”



  • Access the full transcript for this episode

    “I think one of the things we've approached in our data science curriculum is this idea that data science is a team sport…You're never really doing data science on your own. You're always in a team and you're working with product managers. You're working with end users. You're working with software engineers. You're working with salespeople. And that idea of how do I translate people's problems? What is my system going to do? What are the variables and what considerations I have when I'm designing a system with people? What are the algorithms going to do and what does that mean? So that kind of idea of treating it as a team sport and figuring that out as a student, is like a fundamental principle for how we do data science in these environments.”

    In this episode, we sit down with Paul Groth, Professor of Algorithmic Data Science at the University of Amsterdam. Paul describes the structure of data science and AI education in the Netherlands, highlighting that Dutch universities offered undergraduate AI programs before data science became mainstream. He touches on the differences between AI education and data science education, and shares his thoughts on treating scholarly publishing as structured data in his role as co-scientific director of the Discovery Lab. Finally, he describes his approach to teaching and mentorship, which treats data science as a “team sport.”

    “In the Netherlands, we taught AI before we ever taught data science. So actually, we have I think one of the first few places in the world where we have a bachelor's in AI. So I think it’s different from the US system, where you might start off with a general curriculum and then specialize in something like computer science. Here, our undergraduate students come straight from high school and go directly into a subject. So we have, for example, they can go directly into AI and then they'll do a three-year bachelor's in AI.”



  • Access the full transcript for this episode

    “A lot of times, you think as a young woman, well, I want to do something where I can care for the world, where I can make a difference in the world to make it better. Women often don't relate that to computer programming. And so that's where data science comes in and tells a story, where it combines not just the computational skills, but the statistical skills. But then there's also a requirement of having domain expertise. And that domain expertise, that can be anything. It can be healthcare. It could be climate change. It could be whatever field you want to enter. And so it's easier to relate that to caring for people or caring or having an impact in the world.”

    Today, we welcome Nathalie Guebels, Assistant Professor of Computer Science at Santa Barbara City College (SBCC), as she shares her transition from industry to academia and her larger commitment to making data science accessible to community college students. We touch on the development of SBCC's data science pathway, designed to create smooth transfer opportunities for students entering four-year universities. Nathalie also highlights her passion for supporting women in STEM, and details how she incorporates real-world datasets tailored to SBCC students. She talks about designing and co-teaching the cross-disciplinary course "Data Science for All," and reflects on the key role that collaboration has played in shaping SBCC’s growing data science program.

    “Bringing more real world examples and your own stories into the classroom gets the students more engaged and excited with the material because they connect it with their lives and their interests…I'll look at data from my classes, or we'll talk about SBCC farmers markets. But I think one of the main things we've changed…is bringing more focus on SBCC students and on the students themselves.”



  • Access the full transcript for this episode

    “If you're worried that the AI hype is subsuming data science, remember: you can't have AI without data—data science isn’t going anywhere.”

    Welcome to Season 9 of the show! To start off the new year, listen in on our conversation with Micaela Parker, founder and executive director of the Academic Data Science Alliance (ADSA). Micaela shares her journey into data science leadership, emphasizing the importance of inclusivity, interdisciplinarity, and community-building in the evolving fields of data science and AI. She reflects on the founding of ADSA and its mission to support faculty, staff, and students in designing, building, and sustaining data science programs. She discusses how ADSA fosters collaboration through annual meetings, working groups, and workshops that showcase innovative pedagogy and best practices, such as teaching responsible AI and integrating social justice into data science education.

    “We actively solicit keynote speakers, panelists, and participants for all of our events from a diversity of backgrounds and institutions, because we believe strongly that you can only aspire to a career that you see yourself doing, and that starts with seeing someone like you in that role.”



  • Access the full transcript for this episode

    “I would recommend double majoring with a different degree, because I think while data science by itself is a very, very useful and versatile degree, I think being able to apply it to a particular domain overall makes you a better statistician, or economist, or historian, right?”

    —Alan Liang

    In the final episode of the season, we explore the pivotal role students played in shaping Berkeley’s undergraduate data science program. We sat down with three alumni — Alan Liang, Vinitra Swamy, and Gunjan Baid — who were instrumental in building the foundations of data science education at Berkeley. They reflect on their unique contributions, including developing curriculum, infrastructure, and interdisciplinary initiatives, and how those experiences shaped their career trajectories. From Alan’s insights into teaching technical concepts, to Vinitra’s innovative work on scaling Jupyter infrastructure, and Gunjan’s efforts on connector courses and technical systems, we highlight the long-lasting impact of student-led innovation.

    “I really loved the experience here of being a graduate student….there's a very collaborative atmosphere that people are always super excited about working on what they're working on, and that passion is what really drew me to the PhD as well. Like, the excitement to work on ideas that might be a bit too risky, that might be a little bit out there, a bit crazy, but you know, trying to get it to work and to work alongside people that are willing to put in the late nights and early mornings, because they want to, not because someone is forcing them to.”

    —Vinitra Swamy

    “Really dig deep and make sure you understand the details of a problem that you're working on. This still comes up a lot for me, but if something seems like it's off, if you're training a model and something looks funky, it's probably because something is off. And I think it's easy to kind of brush over the details and kind of gloss over that. But more often than not, really kind of getting your hands dirty, peeling back the layers, looking at the data, and going deep on a problem is how you'll make the most progress.”

    —Gunjan Baid



  • Access the full transcript for this episode

    “We introduce a new data set to them every week, and we try and use data sets that are either themed around Illinois, or themed around things that we think that they are interested in. And so that's been something that we started doing when we first piloted the course, and have continued to do that each semester. And the students really are invested in the course, because they're using real world data that they have questions about.”

    —Karle Flanagan

    In this episode, we explore the creation and growth of STAT 107, the University of Illinois Urbana-Champaign’s introductory data science course designed to be accessible to all students, regardless of major or background. We sat down with Karle Flanagan and Wade Fagen-Ulmschneider, the teaching professors behind STAT 107, to discuss their journey from a pilot program with 18 students to a thriving course with 1000+ students today. They detail how they built a curriculum that combines computer science and statistics, while keeping students engaged through real-world datasets, interactive live demos, and interdisciplinary collaboration. They delve into the challenges of scaling the course, the importance of co-teaching, and their broader efforts to expand data science education to high schools through initiatives like the DPI Digital Scholars Program.

    “So in the very beginning, we actually started by being like, there's going to be a CS day and a Stat day, and that I would give a CS lecture, and then Karle would be there, kind of just sitting in the audience, and then Karle would give a Stat lecture the next day, and they'd be interrelated, but they were kind of separated. And then one day, we were just like, I want to kind of get Karle's opinion on something and let her give her perspective, because I come from an engineering background, and I am obsessed with formulas. Karle, I think, really relies more on, like, tables and graphs and like, really wants to understand the story behind data, and only once you're motivated by the story do you really want to dive in deeper. And so the way we see problems are wildly different…Students just love the fact that it's back and forth.”

    —Wade Fagen-Ulmschneider



  • Access the full transcript for this episode

    “One of the ways we incorporate ethics is by trying to expose students to a plurality of perspectives. So we want students to hear from people with different perspectives on what it means to engage with data ethically, and so we do this by hosting guest speakers. We encourage students to take classes in a variety of departments around campus. We also try to introduce students to frameworks that can help them think about how to incorporate diverse perspectives in the creation of tech products and policy.” —Mallory Nobles

    Today, we sit down with Dennis Sun and Mallory Nobles from Stanford University to discuss the university’s innovative approach to undergraduate data science education. Dennis and Mallory share insights into Stanford's dual-track offerings: the technical BS in Data Science and the interdisciplinary BA in Data Science & Social Systems. They dive into the origins and goals behind these programs, highlighting how they equip students with essential skills in data science, statistics, and ethics. The conversation also covers Stanford's emphasis on experiential learning through capstones, project-based courses, and partnerships with fields like neuroscience and engineering.

    “When I came to Stanford, one challenge that was clear to me was that there were hardly any data science and machine learning classes that were accessible to freshmen or students early on in their college careers. So many of them were gated behind probability, linear algebra, and even several computer science courses. And it's a lot to ask a student to take a bunch of theoretical courses before they get to find out what data science is really about. So that was kind of the genesis of the Principles of Data Science course. It was designed to give students a sense of what data science is about, and it gives them the practical motivation to convince them that all the theoretical courses that they'll have to take are going to be worth it in the end.” —Dennis Sun



  • Access the full transcript for this episode

    “So what I try to do with students from areas like the humanities or social sciences is tie it to everyday situations, making analogies or looking for examples of things anyone has experienced, so that they develop that intuition. After that, we give it the form of the syntax and the specific language we're using.”

    In the podcast’s first-ever Spanish-language episode, Eric Van Dusen and special guest host Edwin Vargas Navarro sit down with Camilo Andrés De La Cruz Arboleda from the Universidad Externado de Colombia. Camilo shares his journey from studying law to embracing data science and technology, merging the two fields to innovate legal education in Colombia. He discusses how he engages law students with data science concepts, making technical subjects accessible to those without a STEM background. Camilo also explores the challenges of teaching data science in Latin America, the importance of open data, and the role of data science in sustainability and public policy.

    En el primer episodio en español del podcast, Eric Van Dusen y el invitado especial Edwin Vargas Navarro conversan con Camilo Andrés De La Cruz Arboleda de la Universidad Externado en Colombia. Camilo comparte su trayectoria, desde estudiar derecho hasta abrazar la ciencia de datos y la tecnología, fusionando ambos campos para innovar la educación legal en Colombia. Habla sobre cómo involucra a los estudiantes de derecho con los conceptos de ciencia de datos, haciendo accesibles los temas técnicos para aquellos que no tienen antecedentes en STEM. Camilo también explora los desafíos de enseñar ciencia de datos en América Latina, la importancia de los datos abiertos y el papel de la ciencia de datos en la sostenibilidad y las políticas públicas.

    “I think that, historically, law has been a profession that has been very reluctant to accept a technological revolution. At least here in Colombia, it was only a very few years ago that holding a hearing by video call, or even filing documents by email, was allowed; something that has been around for ages could only recently be incorporated into lawyers' day-to-day work. If you want to stay competitive, you have to at least know what you can do with technology and incorporate it into your daily routine. A lawyer who does that will be ten times better prepared than one who wants to keep practicing the profession the traditional way.”



  • Access the full transcript for this episode

    “I literally collected 150 jobs on Indeed.com and parsed out all of the skills that were mentioned in all the jobs, created a graphic and said, Okay, here's the courses we already have that have these skills, and here's the skills I need to create courses for.”

    Today, we sat down with Crystal Wiggins, a pioneering educator in two-year college data science programs at Connecticut State Community College. Crystal shares her journey in developing Connecticut’s first two-year data science program, which has since expanded to five campuses. She discusses her innovative approach to project-based learning, teaching students to "get comfortable with the uncomfortable," and preparing them to adapt in a rapidly evolving field. Crystal also delves into her leadership role in nationwide conversations about data science in community colleges, her work with organizations like AMATYC (American Mathematical Association of Two-Year Colleges), and her vision for industry partnerships in the classroom.

    “Don't be afraid to dive in. You do not need to be an expert. You can learn this with your students. There's many things that students ask me, and I'm like, Well, let me show you how to find the answer. And I was actually finding the answer for myself because I didn't know, but that's what's great about the field; it's more about teaching them how to find answers than it is knowing everything yourself. So again, my slogan, be comfortable with the uncomfortable, is like the slogan for data science for me, because you're never going to know everything, and that's what I tell my students.”



  • Access the full transcript for this episode

    “I developed a class called DS 100, which is in a lot of ways very similar [to Data 8], with the primary objective being, I want people to walk away from the class with saying I understand what data science is. I can do a little bit of programming, and now it's up to me whether I think it's interesting or not. I don't want anyone ever to feel like they can't do it. It's just whether or not they enjoy doing it.”

    In this episode, we sit down with Langdon White from Boston University to discuss his journey from software consulting to becoming a key figure in BU's growing data science program, starting off with BU Spark! He shares the challenges of expanding a data science curriculum, including the launch of new programs, and his overarching mission to make data science accessible to students of all backgrounds. Langdon also explores innovative teaching methods like experiential learning and gamification, while highlighting the importance of diversity, ethics, and inclusivity in data science education.

    “I continue to think that our biggest challenge in this industry is making sure that we have representation from all backgrounds, right?…Every student should be walking out of the school with an expectation of inclusion and diversity, but also ethics. And that the ethics falls to you…and you know, encouraging students to step up and represent themselves, from an ethical perspective, an inclusion perspective, and the diversity perspective.”



  • Access the full transcript for this episode

    “So what we do in Data Feminism is try to synthesize a whole lot of feminist ways of thinking about the world, that have to do with questions of bias and oppression, that have to do with questions of sort of unequal power, and who gets to make choices about how to design systems — with these sort of really broad social questions, we try to apply them to data science as both a field and as a practice.”

    Join us as we engage in a conversation with Lauren F. Klein, Associate Professor at Emory University and Director of the Digital Humanities Lab. Klein shares her unique journey from a background in comparative literature to pioneering the field of digital humanities, where she bridges the gap between computational methods and humanistic inquiry. We delve into her innovative projects, including her influential "Data Feminism" book and the "Data by Design" project, exploring how these works challenge traditional data science perspectives and emphasize the importance of context, history, and ethics in data visualization.

    “The point that I'm trying to make in this project is that if we take this historicized, almost literary and critical, humanistic lens to this history, we can see how the people who were designing data visualizations were either asking very similar questions to the kinds of questions about responsible data visualization that we're asking today, or they weren't. And because of that, we can see how their visualizations, far from being some sort of neutral representation of data, in fact represented a certain sort of unreflective politics that I think we also need to be able to identify again, so that we don't reproduce that in the present.”



  • Access the full transcript for this episode

    “We're kind of in an early phase among most social scientists, trying to figure out what's new here, what's different, and how to integrate it with our standard social science methodological concerns, which I don't think we should abandon. Thinking about the relationship between theory, concept and measurement. For example, that's one of the things that social scientists bring to the table in data science projects: thinking about questions of representativeness, generalizability, and questions of causal inference.”

    Welcome to the Season 8 premiere! In this episode, we sit down with David J. Harding, a professor in the sociology department at UC Berkeley. David shares his unique academic journey in sociology and data science, emphasizing the integration of social science methodologies with data science tools. He discusses his work on poverty, inequality, and incarceration, and the challenges of using administrative data in research. The conversation delves into future directions for his research on adolescents and urban communities, the importance of bridging social science and data science education, and strategies for creating inclusive classroom environments.

    “A standard complaint about running and estimating models in the social sciences is that we make a lot of assumptions, and then don't have the ability to test them. Then right along comes the kind of more machine learning type workflow, which allows us to learn what the model should look like from a portion of the data, and then test it and validate it on another portion of the data. I think social scientists should be building that sort of workflow into our normal work process all the time.”



  • Access the full transcript for this episode

    “You can be hoodwinked with data in the same way that you can be hoodwinked by a car salesman. And so the idea of [Calling B******t] was to step away from all the details of the black box: that's the statistical procedures, the algorithms, etc. (Not to say that we don't pay attention to what we do.) But the idea is to really pay attention to the input data that's coming in—to think about things like selection bias—to think about where that data is coming from.”

    Join us in our Season 7 finale as we host Jevin West, an associate professor at the University of Washington and a co-founder of the Center for an Informed Public. Dive into a deep discussion about the intersection of data science and misinformation, the challenges of big data, and the ethical considerations that come with it. Jevin shares his experiences from the early days of data science programs, his insights on combating misinformation through education, and the evolution of his course and book, "Calling B******t." Whether you're a data science professional or a student, listen in to explore how data science education can empower us to make informed decisions and foster a more truthful society.

    “One of the most important skills that we're going to want to enhance more and more is humaneness…things like being able to ask questions, to sort of work through logic to really tease out things, like correlation versus causation. Machines don't tend to do so well [with those things]—they don't have access to the physical world. That's one of their weaknesses. So you want to lean into your strategic advantages as humans…maintain that humaneness by doing things that machines can't do.”



  • Access the full transcript for this episode

    Join us as we speak with three different guests, all UC Berkeley Data Science alumni, who have gone on to pursue higher education. Ranging from learning sciences to epidemiology, our guests share their experiences, challenges, and insights into how their data science education prepared them for their current paths.

    Ashley Quiterio, a PhD student in Learning Sciences at Northwestern University, delves into the intersection of data science and education, highlighting the transformative potential of data-driven approaches in shaping learning environments.

    “Try everything and try different things. I mentioned all these different roles [I did during undergrad], where I was trying to see where I fit, deciding what I like about data education. There's all these different lenses and different ways of thinking about where you fit. So I'd encourage people to try that out, early and often. Data science is such an interdisciplinary field that you're not going to be lacking opportunities.” — Ashley Quiterio

    Anna Nguyen, a PhD student in Epidemiology and Clinical Research at Stanford University, shares her journey from data science to public health, emphasizing the importance of interdisciplinary collaboration in addressing complex health challenges.

    “Regardless of what anyone says, there's no pure cut way of getting into grad school. Pursuing opportunities that allow you to really explore your interests and displaying a willingness to learn is probably the best way to prepare for a masters or a PhD program. I think I definitely overestimated how much time I had in undergrad. And the time was so limited and valuable, so it's really not worth doing things that you don't enjoy in that limited time.” — Anna Nguyen

    Rodrigo Palmaka, a Master's student in Statistics at UC Berkeley, offers perspectives on computational pathology and statistical research, illustrating the versatility of data science skills in diverse research domains.

    “I think I always sought to focus on the fundamentals—not overfit or pigeonhole myself too much—and give myself some flexibility to, you know, be able to adapt to the next big thing.” — Rodrigo Palmaka



  • Access the full transcript for this episode

    “UC Merced opened in 2005, so we were starting from a very different place than lots of campuses are. So I try very hard to be really intentional about when we think about hiring people; we want to be very aware of ways that unconscious bias plays out in hiring. When we invite people to give seminars, we try to invite people from a variety of backgrounds and campuses. And so I think that being at UC Merced, a new campus with a really strong emphasis on diversity, it's very much something that's important to the students.”

    Join us in conversation with Suzanne Sindi, Professor of Applied Mathematics and Chair of the Department at UC Merced, as she shares her journey in incorporating data science concepts into her teaching, highlighting the importance of engaging students through real-world applications and interdisciplinary approaches. Suzanne discusses her involvement in diversity initiatives, such as the SIAM Activity Group in Equity, Diversity, and Inclusion, and how it shapes her teaching philosophy and fosters a more inclusive learning environment. We also touch on the challenges and opportunities of data science education in diverse settings, such as UC Merced's Central Valley location, and learn about strategies for preparing students to navigate the evolving landscape of mathematical and computational disciplines.

    “So something like the mean or average value, are words that, you know, have meanings outside of math. And so now you're trying to use this in a context, like in sort of a scientific context. And one of the things I hadn't appreciated is, if you're working with people who potentially don't come from homes where they speak English at home, they don't have maybe the same context for some of those words in those terms.”



  • Access the full transcript for this episode

    “We are definitely a Hispanic enrolling institution, but the TIPS project is aiming to embrace that ‘serving’ term, and just the ideal of serving our Hispanic students. Through the TIPS project, there's a ton of professional development — very deep, profound professional development. We want an entire department to participate in the TIPS pathway because the department is a unit of change, meaning that the entire community and culture of that department will change, rather than just having a few people who are interested in DEI initiatives.”

    Join us in discussion with Dr. Omayra Ortega, a professor at Sonoma State University, as we delve into the evolving landscape of data science education. From her journey as a mathematician with a background in music to her current endeavors in mathematical epidemiology and data science, Dr. Ortega shares insights into the intersectionality between gender, ethnicity, and inclusion in the data science community. As a former president of the National Association of Mathematicians and a passionate advocate for underrepresented groups in STEM, Dr. Ortega discusses the importance of fostering diversity and equity in data science education.

    “If you're a data science educator, make friends with other data science educators because I'm sure they need help. They need your ideas, your models for how you run your degree program, for how you run your classes, and best practices. Go to those lovely workshops that are organized at UC Berkeley every summer and spring — if you're in California, join CADSE.”


