
First-Year Foundations

This guide will take you through your FFC 100 information literacy session.

AI Literacy and Data Literacy

What is Artificial Intelligence? 

Whether you know it or not, artificial intelligence (AI) is already a large part of your daily life. Much of it is invisible to you, but it's working behind the scenes personalizing your video and music recommendations, customizing your social media feeds, analyzing your spending habits to detect fraud, and touching up your posts with photo filters. 

So, what is artificial intelligence? 

Broadly speaking, artificial intelligence refers to the development of computer systems that can perform tasks that would typically require human intelligence. These systems are designed to learn, reason, and make decisions based on large amounts of data. 

While there isn't a single definitive definition of AI, NASA uses the following set of definitions: 

  • Any artificial system that performs tasks under varying and unpredictable circumstances without significant human oversight, or that can learn from experience and improve performance when exposed to data sets. 
  • An artificial system developed in computer software, physical hardware, or other context that solves tasks requiring human-like perception, cognition, planning, learning, communication, or physical action. 
  • An artificial system designed to think or act like a human, including cognitive architectures and neural networks. 
  • A set of techniques, including machine learning, that is designed to approximate a cognitive task. 
  • An artificial system designed to act rationally, including an intelligent software agent or embodied robot that achieves goals using perception, planning, reasoning, learning, communicating, decision-making, and acting. 

Source: What is Artificial Intelligence? (n.d.). NASA. Retrieved July 21, 2025, from https://www.nasa.gov/what-is-artificial-intelligence/

One of the most important things to understand about AI is that the decisions made by AI are based on probability and statistics. So, no matter how advanced the system is or how much data was used to train a particular AI program, the decisions made by AI are not based on the same kind of nuanced and creative reasoning that a human can accomplish. Probability-based decisions are often correct, but not always--so there will be times when AI tools make errors or even spread misinformation. 


What is AI Literacy? 

Some fundamental abilities that are useful to all students in today's information environment include being able to: 

    1. Understand the basics of how AI works
    2. Use AI effectively and ethically
    3. Make informed decisions about using AI technologies

Source:  Hennig, Nicole. “AI Literacy: May 17 Webinar.” Nicole Hennig, March 16, 2023. https://nicolehennig.com/ai-literacy-may-17-webinar/.

So, why are we talking about AI literacy?

In the last couple of years, there have been significant advancements in AI that have impacted the daily lives of students--in particular, the release of tools that fall under the category of generative AI. Generative AI refers to a specific class of AI tools that are able to create new content; these include ChatGPT, Copilot, Gemini, Claude, and Midjourney, just to name a few. With such powerful new technology at our fingertips, it's no surprise that generative AI continues to transform how we find, use, and create information--in other words, our information literacy habits and abilities. 

Right now is a time when educators and students alike are figuring out how generative AI fits into and affects their learning. Although it is challenging, we are all figuring it out, just as those who came before wrestled with the invention of tools like Google, Wikipedia, and the Internet that similarly changed information behaviors when they were introduced. 

Our goal for AI literacy, then, is to provide guidance to students to help develop useful and ethical information behaviors, including how to contextualize information from AI and think critically about the use of AI in their work. Not only is this important now, as you're starting college, but it will remain important after you graduate, since AI skills are increasingly expected in the workplace.

 

What is a Large Language Model? 

The Generative AI chatbots that we are all familiar with--ChatGPT, Claude, Copilot, Gemini, and such--are powered by Large Language Models (LLMs). An LLM is a software system that has been designed to be capable of understanding and generating human language. Historically, when a software system has been developed so that it can imitate or replicate activities that we associate with human intelligence, activities like understanding, writing, and speaking a language, we label that system as an example of artificial intelligence.  

LLMs have been exposed--in a process called training--to vast amounts of textual data in order to understand and generate human-like responses. These advanced language models enable chatbots to carry on coherent and contextually relevant conversations.

LLMs are built using deep learning techniques and trained on enormous textual datasets so that they can excel in various language processing and understanding tasks. The data is processed through a neural network, specifically, a "transformer" architecture (note, the "T" in GPT stands for "transformer"). Here's a reasonably simple explanation of how it works: 

Specifically, a transformer can read vast amounts of text, spot patterns in how words and phrases relate to each other, and then make predictions about what words should come next. You may have heard LLMs being compared to supercharged autocorrect engines, and that's actually not too far off the mark: ChatGPT and Bard don't really “know” anything, but they are very good at figuring out which word follows another, which starts to look like real thought and creativity when it gets to an advanced enough stage.

Source: David Nield. “How ChatGPT and Other LLMs Work—and Where They Could Go Next.” Wired, April 30, 2023. https://www.wired.com/story/how-chatgpt-works-large-language-model/
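
To make that "supercharged autocomplete" idea concrete, here is a minimal, hypothetical Python sketch. It is not how a transformer actually works internally--a real LLM weighs enormous amounts of context using billions of learned parameters--but it shows the basic idea of predicting the next word from probabilities observed in training text.

```python
# Toy illustration (not a real LLM): count which words follow which in a tiny
# sample text, then "predict" the most likely next word from those counts.
from collections import Counter, defaultdict

sample_text = "the cat sat on the mat the cat ate the fish"
words = sample_text.split()

# Count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for current_word, next_word in zip(words, words[1:]):
    next_word_counts[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the sample text."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))   # -> 'cat' ("cat" follows "the" most often here)
```

A real LLM does something analogous at a vastly larger scale, considering the whole conversation so far rather than just the previous word.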

Generative AI Tools

ChatGPT, a generative AI-powered chatbot from OpenAI, is currently the most popular tool based on an LLM; many estimates put its share of the chatbot market at around 60%. We mentioned some other chatbots above, and there are other generative AI tools that go beyond conversational capabilities and focus more on search and research, like Perplexity, Consensus, and Elicit.

If you have ever used ChatGPT, or one of the other AI chatbots, you probably know that it has a relatively simple interface. You enter a question in natural language (known as a prompt), and it generates text responses that approximate human language. There are all kinds of applications for this technology, from creative endeavors to research assistance (you'll read more about the ethical use of generative AI tools on the next page of this module). 

So, are Generative AI chatbots search engines? 

Just two years ago, it was easy to state that ChatGPT was not a search engine. Traditionally, a search engine like Google would seek out existing texts (in the form of webpages) and return them as a set of results, while LLM-enabled chatbots, like Claude and ChatGPT, would generate brand-new text based on the probability of the most plausible response. Increasingly, however, companies like Microsoft and Google are expending enormous effort to combine generative AI LLMs with search engines. You can already see this in the results that a Google search provides. Chapman University makes Microsoft's Copilot available to all students; Copilot adds the capabilities of an LLM (currently OpenAI's GPT-4o and GPT-4.1 models) to Microsoft's existing search engine, Bing. Another example of integrating search capabilities with an LLM is the research-oriented tool Perplexity.

Now that you have some foundation for understanding what LLMs and generative AI are capable of, you're probably wondering how you can use these tools. Read on to the next page. 

 

What are some ethical ways college students can use generative AI tools, including ChatGPT, Claude, Copilot, Gemini, or Perplexity?

  • Generate ideas to get started with research or creative pursuits
  • Ask for keywords to simplify a search to get better search results
  • Ask for ways to expand a topic into new research directions
  • Ask for suggestions to improve your writing--such as grammar or tone--or to find weak spots in your research

All of these methods are useful in developing your research topic using a chat-enabled Large Language Model (LLM) or other generative AI tools. Notice that these tools can be ethically used in ways that support the entire research process. 

What are ways to use LLM-enabled chatbots or search engines that should be avoided? 

  • Asking the bot to write all or part of an essay
    • Not only is this unethical, but it probably won't be very good. The chatbot hasn't been in your class and doesn't know the context of what you're supposed to have learned throughout the semester.
  • Asking the bot to search for your sources
    • This isn't necessarily unethical, but it's not a recommended method. LLMs will often make up sources that look real but are actually non-existent. Increasingly, LLM chatbots--such as Copilot, Claude, or Perplexity--will point you to internet resources, but often they fail to return articles that are academic or scholarly. In many cases, their returned results are superficial or NOT as good as what you could find through human evaluation or through the use of library resources. Doing the research yourself will allow you to search creatively and will produce better results. Remember, LLMs do not explicitly contain facts. They are statistical models of how human languages work. Library research databases will lead you to facts.

Gray areas in the ethical use of generative AI: 

  • Simplify the language of a text in order to understand it better
    • While this is a powerful use of the technology to aid in learning, there may be copyright concerns if the bot incorporates copyrighted text into its knowledge base. There are some tools that do not include your uploaded materials or prompts in their training data. If you are interested in such tools, ask a librarian for guidance.

This page is mostly based on the following source:  Mollick, E. (2025, January 26). Against “Brain Damage” [Substack newsletter]. One Useful Thing. https://www.oneusefulthing.org/p/against-brain-damage  

Critical Thinking and Generative AI

Some people have argued that generative AI tools like ChatGPT make you dumber or damage your brain. The good news is that they don't really. But these tools do have the potential to negatively impact your learning, your creativity, and your critical thinking abilities.
According to Ethan Mollick (2025), an influential researcher on AI, the danger isn’t that AI makes you “dumber”; it’s that it offers a tempting shortcut that can cause you to skip the real thinking and learning that help you grow as a student and thinker.

A recent study from MIT's Media Lab found that when students use generative AI, it can short-circuit the mental effort that leads to deeper understanding. That means that even when you’re using generative AI honestly (not just to cheat but to help with your work), you still have to be careful that it doesn’t backfire by preventing you from really learning. 

 

The Key: Use AI as a Tool, Not a Shortcut in Learning

Here are some tips from Mollick (2025) to make sure you are at the center of the learning and thinking process: 

  • Treat AI like a tutor, not a homework machine.
    • You may want to use specialized prompts (like the one linked here) that guide the AI to help you learn rather than just giving answers.
  • Don’t start with AI. Instead, do your own thinking first.
    • Start your own brainstorming and write down your ideas. Then, turn to AI to build on them or push your thinking further. Some examples:
      • “Combine ideas #3 and #7 in an extreme way,” “Even more extreme,” “Give me 10 more ideas like #42,” “Use superheroes as inspiration to make the idea even more interesting” (Mollick, 2025). 

 

Writing Is Thinking

As Mollick points out, many writers believe that the act of writing helps to think through ideas. If you let AI handle your writing, you may never figure out what you truly believe or how much you understand about a topic. 
 
Try using the following strategy to maximize your thinking while using AI: 

  • First, write out a full first draft without any help from AI.
  • Then, give the draft to AI and ask it to act as a reader or editor. Examples: 
      • Point out unclear writing or ideas.
      •  Improve the tone or language for different audiences.
      • Suggest ways to strengthen your writing, like alternative endings, clearer wording, or ways to make an argument stronger. (Mollick, 2025)

 

What’s Really at Stake

Mollick puts this well: 

AI doesn't damage our brains, but unthinking use can damage our thinking. What's at stake isn't our neurons but our habits of mind. There is plenty of work worth automating or replacing with AI (we rarely mourn the math we do with calculators), but also a lot of work where our thinking is important. ... Our fear of AI “damaging our brains” is actually a fear of our own laziness. … Your brain is safe. Your thinking, however, is up to you. (Mollick, 2025)

 

 



What is Data Literacy?

In our research and our daily lives, we constantly interact with data, but we rarely reflect on how well we understand it.

In order to critically consume, produce, and think about data, we need a basic framework for understanding it. Data literacy is our ability to interpret and understand data. On this page, we’ll briefly focus on a few related key concepts and some basic terminology.

We rarely encounter raw data in our everyday lives. Instead, we see data represented as numbers and charts that are meant to tell a data story or provide evidence for a claim.  

A single data point, or datum, can be almost anything: a measurement, an observation, a response to a survey question. Usually, a single data point doesn’t tell us much. But a collection of data points, or data, has the potential for us to make larger observations or draw conclusions about the information that we collected.

When we collect similar or related data together, we have a data set.

To illustrate these terms, here’s an example: 

"Suppose you were out hiking and you accidentally fell and broke your arm. Luckily, your friends are there to take you to the emergency room and help you fill out the mountain of paperwork the nurse hands you. Each answer you scribble down gives the doctor data they can use to decide how to help heal your arm. All the information together makes up a data set on you and your medical history.

Now some questions, like “Age?” or “Height?” give the doctor quantitative data or data that’s represented by numbers, or quantities.

But not everything about us can be written as a number. Like the question: how the heck did you break your arm? Information that’s describing qualities something has or a category it belongs to is called qualitative data. …

And after you make it through all the paperwork, the doctor might add data in other ways. If they take an x-ray, this photo becomes a data point in your medical history. Or they might record an audio clip of your heartbeat or video of an ultrasound. Your file has a long story to tell."

Source: Arizona State University, “What are Data and Data Literacy.”

Reading Visual Data

Data is more than just numbers: it can include images, text, or audio. Interpreting charts, graphs, and visuals is as important as reading numbers.

Visual literacy is the ability to read, interpret, and critically evaluate visual information, including but not limited to data representations such as charts, graphs, and maps.

Understanding visual data is key to spotting patterns, identifying misleading visuals, and making informed decisions.
When viewing a chart or graph, here are some starting points you can consider:

Infographic: How to interpret visual data. It outlines three steps: (1) Evaluate the data representation: check the chart title, chart type, and axis labels. (2) Analyze the data patterns: look for trends, outliers, and missing or exaggerated data. (3) Consider the context and interpretation: examine design choices, data source credibility, and data collection methods.

We encounter data everywhere in our lives, yet we don't always possess the skills to confidently interpret the facts and figures we see in news stories. 

A common technique for understanding a dataset is to describe it with measures of central tendency. One of the most common measures is the mean, which is often informally referred to as the average. Averages are useful in reporting because they summarize a large amount of data into a single value, but keep in mind that the underlying data varies around that single value.

Knowing the definition of average and how it’s calculated allows you to understand that the number doesn’t reflect all items in a dataset. Understanding what is meant by average can help you to appreciate the importance of Data Literacy and will help you to comprehend the information presented to you in data journalism. 

This video is used with the permission of its creator, Genevieve Milliken, Data Services Librarian, NYU Health Sciences Library.

 

Transcript

Media outlets such as news channels, websites, and radio stations deliver massive amounts of information to the public every day. The sheer volume of it can be difficult to process.

Added to this is the reality that what is reported—by even the most reliable outlets—is the product of someone else’s analysis and interpretation of raw data.

So how can anyone hope to make sense of what a reported statistic means?

The answer is highly complex, but there are helpful places to start.

One way of making sense of it all is to understand the term average.

The average, or mean, is derived by adding together all the points in a series and dividing that by the total number of points.

If we begin with this series of unordered numbers, the average is 14.18. Notice that this number does not appear in the series but is derived from it.

In general, the average is the most common term used when talking about statistical data.

When a report states that the average household income is $85,000, it does not signify that the average person will earn $85,000 per year.

It means that when all the incomes in the collected sample are added together and then divided by the number of data points in the set, the number derived is $85,000.

This number can be skewed upward or downward.

If the sample used to calculate income only included college students making around $5,000 per year, the number will be low.

If, however, the sample includes Stephen Kaufer, CEO of TripAdvisor, who made around $48 million last year, the final number will skew upward—pretty dramatically.

This becomes clear if we return to our numbered list.

The inclusion of three large numbers pushes the average higher than it would have been without them.

In other words, a story about income that includes very large paychecks has the potential to lead us to conclusions that may not be representative of what an average person—whatever that means—earns per year.

The lesson here is that it is important to understand that the average and statistics may not represent any one person or situation, but instead the particular dataset chosen.
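
To make the transcript's income example concrete, here is a minimal Python sketch using made-up numbers. It shows how a single very large value can pull the mean far above what most people in the sample actually earn.

```python
# Hypothetical annual incomes for five college students (made-up numbers).
incomes = [5_000, 6_000, 5_500, 7_000, 4_500]

mean_without_outlier = sum(incomes) / len(incomes)
print(f"Mean without outlier: ${mean_without_outlier:,.0f}")   # $5,600

# Add one CEO-sized paycheck (also made up) and recompute the mean.
incomes.append(48_000_000)
mean_with_outlier = sum(incomes) / len(incomes)
print(f"Mean with outlier: ${mean_with_outlier:,.0f}")         # roughly $8,004,667
```

The mean jumps from a few thousand dollars to several million, even though five of the six people in the sample earn under $7,000, which is exactly the kind of skew the video describes.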

How Visuals Can Mislead

"If you torture the data long enough, it will confess to anything" - economist Ronald Coase

A common misconception is that data is purely objective. We see charts and naturally feel that the information we're seeing is truthful and persuasive. After all, it’s just numbers, right? But take caution! As Alberto Cairo says in his book, How Charts Lie: 

Politicians, marketers, and advertisers throw numbers and charts at us with no expectation of our delving into them: the average family will save $100 a month thanks to this tax cut; the unemployment rate is at 4.5%, a historic low, thanks to our stimulus package; 59% of Americans disapprove of the president's performance; 9 out of 10 dentists recommend our toothpaste; there is a 20% chance of rain today; eating more chocolate may help you win the Nobel Prize (Cairo, xi). 

Source: Cairo, A. (2019). How charts lie: Getting smarter about visual information (First edition). W. W. Norton & Company, Inc.

In fact, sometimes even real, accurate data can be used to deceive.

 

Deceptive Data Example 1: Amazon Sales
Take a look at this chart of country of origin for production of items for sale on Amazon. What assumption might someone make when looking at these percentages?

Chart: "Made in China, Sold on Amazon" -- share of items sold on Amazon, by country of origin. This horizontal bar chart shows the percentage of Amazon products sold that originate from different countries, based on a 2024 survey of 1,064 first- and third-party sellers, and is accompanied by a world map highlighting each country.

  • China: 71%
  • United States: 30%
  • India: 14%
  • Germany: 6%
  • Mexico: 5%
  • Japan: 5%
  • Vietnam: 5%
  • Other countries: 26%

Note: Percentages total over 100% because sellers could select multiple source countries. Source: Jungle Scout via ECDB, as published by Statista in 2024.

At first glance, it may appear that the percentages represent a breakdown of a whole, as is typical in bar charts. But the total exceeds 100% (71% + 30% + 14% + … = well over 150%).

If someone did not read or understand the footnote carefully, it might be wrongly assumed the percentages represent exclusive product sourcing from each country. In reality, survey respondents could select multiple countries, meaning many sellers source from more than one location.

A stacked or grouped bar chart would have been more effective in showing overlapping data selections and preventing misinterpretation. This example of a revised chart may be a more accurate data visualization:

This stacked bar chart shows the percentage of Amazon sellers who source products from each country. It distinguishes between sellers who source only from that country and those who source from it along with others. Countries are listed on the x-axis and percentages on the y-axis.
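
If you are curious how such a chart might be built, here is a minimal Python (matplotlib) sketch with entirely hypothetical numbers. The point is only to show how a stacked bar chart can separate "sources only from this country" from "sources from this country and others" so the overlap is visible.

```python
import matplotlib.pyplot as plt

# Hypothetical percentages, for illustration only (not the survey's actual data).
countries = ["China", "United States", "India", "Germany", "Mexico"]
only_this_country = [40, 10, 4, 2, 1]      # sellers sourcing only from this country
this_and_others = [31, 20, 10, 4, 4]       # sellers sourcing from this country plus others

fig, ax = plt.subplots()
ax.bar(countries, only_this_country, label="Sources only from this country")
ax.bar(countries, this_and_others, bottom=only_this_country,
       label="Sources from this country and others")
ax.set_ylabel("Percentage of sellers")
ax.set_title("Where Amazon sellers source products (hypothetical data)")
ax.legend()
plt.show()
```

Because each bar is split into two clearly labeled segments, a reader can see at a glance that the categories overlap rather than adding up to 100%.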

Deceptive Data Example 2: 2024 Presidential Election Maps

Here is another example of deceiving data visualizations:

This is a map of the 2024 Presidential Election results by land area of each county. Does it look like most of the country voted red or blue? 

Map: 2024 Presidential Election Results by County (total area of each color: red 84.0%, blue 16.0%). A U.S. map visualizing the 2024 presidential election results at the county level using red and blue circles, with each circle representing a county and sized by land area, not population. Red circles indicate counties won by the Republican candidate (Donald Trump); blue circles indicate counties won by the Democratic candidate (Kamala Harris). Most of the map is dominated by red circles, especially in rural regions, giving the impression of a large Republican victory in terms of geography. Alaska and Hawaii are shown with similar styling in the lower left. Source: engaging-data.com; county-level election data from The New York Times, last updated 11/20/2024.

The same dataset was used to plot another map, this time sizing each county's bubble according to its population, with the largest bubbles representing the most populated counties. Does this visualization change your interpretation of whether more people voted red or blue?

Map: 2024 Presidential Election Results by County (total area of each color: red 49.0%, blue 51.0%). The same U.S. map, but with each circle sized by county population rather than land area, so the visual represents how many people voted for each candidate. Red circles indicate counties won by Donald Trump; blue circles indicate counties won by Kamala Harris. The larger blue circles, especially in major urban areas, make the map appear more evenly balanced or leaning blue. The chart notes that 51.0% of the population voted blue and 49.0% voted red, though this may reflect rounding rather than actual vote totals. Alaska and Hawaii are included at the bottom left with the same circle sizing method. Source: engaging-data.com; county-level election data from The New York Times Election API, last updated 11/20/2024.

 

In the first map, it appears that most of the country voted red. The map is dominated by red circles, suggesting a landslide victory. But this map shows land, not people. Large rural counties with low population density take up more visual space, skewing our perception of voter distribution.

In the second version of the map, this time sized by population, the size of each bubble now reflects how many people live in each county. High-population urban areas (often voting blue) appear much larger.

Each map tells a different story:

  • The land-based map suggests a strong red majority.
  • The population-based map shows a much closer presidential race, and gives the impression of a blue advantage.

In actuality, the 2024 popular vote was very close, with Trump winning the election. Actual 2024 results:

  • Donald Trump (Republican = red): 77,302,580 votes (49.8%)
  • Kamala Harris (Democrat = blue): 75,017,613 votes (48.3%)

This illustrates how even accurate data can be presented in ways that shape very different interpretations, depending on what is emphasized.


So, what's the point? 

You don’t need to be a data scientist to protect yourself from misleading data. But you can build some healthy skepticism in order to be an informed information consumer. Just remember, instead of accepting charts and statistics right away, try to question where the data came from, how it might have been collected, and whether it is accurately depicted in the source you're seeing. 

Visuals are powerful tools, but they can be used for persuasion just as much as for truth.

How can I Improve My Data Literacy?

There are an enormous number of introductory tutorials, videos, and webinars available to you through a quick internet search. If you would like to dig deeper into the subject, SAGE Campus gives Chapman University students the ability to create a free account to take demo courses. Some of the many available courses are listed below:

  • See Numbers in Data
  • Statistical Significance
  • Introduction to Data Visualisation

Where can I Find Data?

Check out the Leatherby Libraries guide on Data Sets and Resources to find all sorts of data for your research!

Check your understanding of AI literacy and data literacy by completing the practice quiz below. 

You may also open the quiz in a new tab or window using this link: AI Literacy and Data Literacy - FFC Practice Quiz