Chapter 1 released

A PDF of a Chapter 1 draft has been posted to the website (see downloads). This chapter focuses on data basics. In particular, this chapter includes

  • variable types,
  • bias in data,
  • data structure,
  • graphing data, and
  • types of studies.

We will re-release Chapter 1 in the future with an R supplement.

We have omitted the topics below, for the reasons given:

  • Stem-and-leaf diagrams: this method is not used in practice.
  • Sampling methods (beyond SRS): these will be introduced later when students understand the motivation for stratified, cluster, and multistage sampling.
  • Blocking: as with sampling, it is difficult for students to understand the purpose of blocking without discussing modeling.
  • Z scores: we postpone this topic until Chapter 3, where we discuss modeling and Z scores become important.

We look forward to reading your comments below on Chapter 1.


2 comments so far

  1. Ista Zahn on

    I have to question the decision to omit presenting stem-and-leaf displays. These displays are in my opinion superior to histograms, because they contain more information in the same space. There may be other reasons for not discussing stem-and-leaf displays, but “they are not used in practice” does not strike me as a good one. If they are omitted from introductory textbooks like this one, chances are that their use will decline even further, not because they are inferior, but simply because textbook authors are following the trend instead of shaping it.

    • David Diez on

      It may have been useful to also note the reasons why stem-and-leaf (S&L) plots are falling out of favor with the community. Two reasons come to mind that explain the trend: versatility and simplicity. First, when an S&L plot is appropriate, so is a histogram, but there are many instances where a basic S&L plot is not useful and a histogram is entirely appropriate**. Second, S&L plots are visually complex and may distract from the overall picture of the data. While individual data points are important, plots are usually used to understand the global view of the data, and the additional visual complexity of S&L plots can distract from this goal.

      S&L plots do have benefits: they contain additional information about the data and naturally lead into histograms. There are instances where S&L plots provide insights that histograms do not, but these tend to be special cases that are already omitted from an introductory course. We gave more thought to the possibility that S&L plots may help introduce histograms. However, we decided histograms were simple enough that we could introduce them without S&L plots.

      ** Simple S&L plotting methods break down in the following examples:
      (a) Data with large values and/or high precision, e.g. 543 52 913 1523 287 … 348 931 538.
      (b) Any data set with more than about 75 observations.
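
To make case (a) concrete, here is a minimal Python sketch of a basic S&L display, added for illustration and not taken from the textbook. The function name `stem_and_leaf`, the `leaf_unit` parameter, and the `small` data set are hypothetical; `wide` uses only the first five values listed in (a). The sketch assumes non-negative values and treats the last digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return the rows of a basic stem-and-leaf display.

    Assumes non-negative values. Each value is truncated to a whole
    number of leaf units; the last digit is the leaf, the rest the stem.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = int(v // leaf_unit)      # drop precision below the leaf unit
        stem, leaf = divmod(scaled, 10)   # e.g. 24 -> stem 2, leaf 4
        rows[stem].append(leaf)

    # Print every stem between the smallest and largest, including empty
    # ones, so that gaps in the data remain visible.
    lo, hi = min(rows), max(rows)
    return [
        f"{stem:>3} | {''.join(str(d) for d in rows.get(stem, []))}"
        for stem in range(lo, hi + 1)
    ]


# Hypothetical small data set: the display is compact and readable.
small = stem_and_leaf([12, 15, 21, 24, 24, 30, 37, 41])
print("\n".join(small))

# The first five values from example (a): the stems run from 5 to 152,
# so the display needs one row per stem, and most rows are empty.
wide = stem_and_leaf([543, 52, 913, 1523, 287])
print(len(wide), "rows")
```

Passing a larger `leaf_unit` (for example 100) shortens the display for the second data set, but only by discarding the trailing digits, which gives up the extra detail that is the main appeal of an S&L plot.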
