Here is how I do it from an SQLite database using sqlite3, pandas and pdfkit.

Well, one way is to use markdown. You can use df. This converts the dataframe into an HTML table. From there you can put the generated HTML into a markdown file. There you can use an extension (search "markdown to pdf") that will make the conversion for you.
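The database-to-HTML step can be sketched with the standard library alone. This is a minimal illustration, not the answerer's code: the `scores` table, its columns, and the helper `rows_to_html_table` are hypothetical. With pandas, `DataFrame.to_html()` returns a comparable table in one call; the later HTML-to-PDF conversion is not shown.

```python
import sqlite3
from html import escape


def rows_to_html_table(columns, rows):
    """Render column names and row tuples as a minimal HTML table (hypothetical helper)."""
    head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


# Hypothetical in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, points INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [("Ann", 3), ("Bo <b>", 5)])

cursor = conn.execute("SELECT name, points FROM scores ORDER BY name")
columns = [description[0] for description in cursor.description]
html_table = rows_to_html_table(columns, cursor.fetchall())
conn.close()
```

Escaping each cell with `html.escape` keeps stray markup in the data from breaking the table before it is handed to a PDF converter.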
I did not use pdfkit, because I had some problems with it on a headless machine. But weasyprint is great.

What is an efficient way to generate PDF for data frames in Pandas?

The PDF document contains eight basic types of objects described below.
These types are: booleans, numbers, strings, names, arrays, dictionaries, streams and the null object. Objects may be labeled so that they can be referenced by other objects. A labeled object is also called an indirect object. There are two keywords: true and false that represent the boolean values.
There are two types of numbers in a PDF document: integer and real. An integer consists of one or more digits optionally preceded by a plus or minus sign, for example 123, +17 or -98. A real value is written as one or more digits with an optional sign and a leading, trailing or embedded decimal point (a period), for example 34.5, -3.62, 4. or -.002. The length of a name is limited to a fixed number of bytes.
When writing a name, a slash must be used to introduce it; the slash is not part of the name but is a prefix indicating that what follows is a sequence of characters representing the name. If we want to use whitespace or any other special character as part of the name, it must be encoded with two-digit hexadecimal notation. Figure 6: PDF names (source). Strings in a PDF document are represented as a series of bytes surrounded by parentheses or angle brackets, and their length is also limited.
Any character may be represented by its ASCII representation, or alternatively with octal or hexadecimal representations. Octal representation requires the character to be written in the form \ddd, where ddd is an octal number.
A string embedded in parentheses looks like this: (This is a string). We can also use escape sequences for special well-known characters when representing a string. Those are: \n for new line, \r for carriage return, \t for horizontal tab, \b for backspace, \f for form feed, \( for left parenthesis, \) for right parenthesis and \\ for backslash.
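The escaping rules for parenthesized strings can be sketched as a small encoder. This is an illustrative helper under stated assumptions, not part of any PDF library: the name `pdf_literal_string` is hypothetical, and it only handles characters that fit in a single byte.

```python
def pdf_literal_string(text: str) -> str:
    """Return text as a PDF literal string wrapped in parentheses (hypothetical helper)."""
    named_escapes = {
        "\n": r"\n", "\r": r"\r", "\t": r"\t", "\b": r"\b", "\f": r"\f",
        "(": r"\(", ")": r"\)", "\\": r"\\",
    }
    out = []
    for ch in text:
        if ch in named_escapes:
            out.append(named_escapes[ch])        # well-known escape sequence
        elif 32 <= ord(ch) < 127:
            out.append(ch)                       # printable ASCII is written as-is
        elif ord(ch) < 256:
            out.append(f"\\{ord(ch):03o}")       # any other byte as \ddd (octal)
        else:
            raise ValueError("only single-byte characters are handled in this sketch")
    return "(" + "".join(out) + ")"


escaped = pdf_literal_string("a(b)\\c\n")
```

Escaping both parentheses is the conservative choice; it avoids having to track whether parentheses in the text are balanced.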
Arrays in PDF documents are represented as a sequence of PDF objects, which may be of different types, enclosed in square brackets. This is why an array in a PDF document can hold any object types, like numbers, strings, dictionaries and even other arrays. An array may also have zero elements. An example of an array is [549 3.14 false (Ralph) /SomeName]. In a dictionary, each key must be a name object, whereas the value can be any object, including another dictionary.
The number of entries in a dictionary is also limited. A stream object is represented by a sequence of bytes and may be unlimited in length, which is why images and other big data blocks are usually represented as streams. A stream object consists of a dictionary object followed by the keyword stream, a newline, the stream data and the keyword endstream.
The stream dictionary specifies the exact number of bytes of the stream. After the data there should be a newline and the endstream keyword. Common entries used in all stream dictionaries include Length (which is mandatory), Filter and DecodeParms. In an object stream, the stream data contains N pairs of integers, where the first integer represents the object number and the second integer represents the offset in the decoded stream of that object.
The First entry in the dictionary identifies the first object in the object stream. In PDF 1. Each cross-reference stream contains the information equivalent to the cross-reference table and trailer. First of all, we must know that any object in a PDF document can be labeled as an indirect object. This gives the object a unique object identifier, which other objects can use to reference the indirect object.
By declaring an object an indirect object, we are able to use it in the PDF document cross-reference table and reuse it by any page, dictionary and so on in the document. Since every indirect object has its own entry in the cross-reference table, the indirect objects may be accessed very quickly.
The object identifier of the indirect object consists of two parts; the first part is an object number of the current indirect object. The second part is the generation number, which is set to zero for all objects in a newly-created file.
This number is later incremented when the objects are updated. We can refer to the indirect objects with indirect reference, which consists of the object number, the generation number and the keyword R.
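The syntax of indirect objects and indirect references can be sketched with two small formatting helpers. The function names are hypothetical and the object body is passed in as ready-made PDF syntax; this only illustrates the "object number, generation number, keyword" layout described above.

```python
def indirect_object(obj_num: int, gen_num: int, body: str) -> str:
    """Wrap a PDF object body so that it becomes a labeled (indirect) object."""
    return f"{obj_num} {gen_num} obj\n{body}\nendobj\n"


def indirect_reference(obj_num: int, gen_num: int = 0) -> str:
    """Build a reference: object number, generation number and the keyword R."""
    return f"{obj_num} {gen_num} R"


# Object 3, generation 0 (a newly created file), holding a string.
labeled = indirect_object(3, 0, "(Hello)")
# Another object can point at it from inside a dictionary.
pointer = f"<< /Title {indirect_reference(3)} >>"
```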
An indirect reference to object number 3 with generation number 0, for instance, is written 3 0 R. Most of the objects in a PDF document are dictionaries. Page objects are connected together and form a page tree, which is declared with an indirect reference in the document catalog. The whole structure of the PDF document can be represented with the picture below [1].
Figure 7: Structure of the PDF document (source). In the picture above, we can see that the document catalog contains references to the page tree, outline hierarchy, article threads, named destinations and interactive form. The Document Catalog is the root of the objects in the PDF document. It also contains the information that declares how the document will be displayed on the screen. The document catalog has a number of possible entries.
The reader can take a look at our sources for details. An example of the document catalog is presented below: 1 0 obj. The pages of the document are accessed through the page tree, which defines all the pages in the PDF document.
The tree contains nodes that represent pages of the PDF document, which can be of two types: intermediate and leaf nodes. Intermediate nodes are also called page tree nodes, while the leaf nodes are called page objects. The simplest page tree structure can consist of a single page tree node that references all of the page objects directly (so all of the page objects are leaves).
Each page tree node has to have the entries Type, Kids and Count, plus Parent in every node except the root. A basic example of a page tree can be seen below: 2 0 obj. We can also see that the leaves of the page tree are dictionaries specifying the attributes of a single page of the document.
There are multiple attributes that we can use when defining them for each document page. Figure 8: Simple document. We can see that the. We can compile the. The resulting PDF then looks like this shown in the picture below: Figure 9: Result.
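Since the listing of the simple document is missing here, the pieces described above (catalog, page tree, page object, cross-reference table and trailer) can be sketched as a tiny generator. This is a hedged illustration, not the original example: the function name, the PDF 1.4 header and the US Letter MediaBox are assumptions, and the single page is left blank.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page, blank PDF from three indirect objects (illustrative sketch)."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",                       # 1 0 obj: document catalog
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",               # 2 0 obj: page tree root
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>", # 3 0 obj: page (size assumed)
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))                  # byte offset needed by the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"               # entry for object 0, always free
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset       # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


pdf_bytes = build_minimal_pdf()
```

Writing `pdf_bytes` to a file in binary mode gives a document whose catalog points at the page tree, which in turn lists the single page as its only kid.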
Barrett George A. All rights reserved. No part of this book may be reproduced in any form, by photostat, microform, retrieval system, or any other means, without prior written permission of the publisher.
Lawrence Erlbaum Associates, Inc. Printed in the United States of America 10 9 8 7 6 5 4 3 2 1 Disclaimer: This eBook does not include the ancillary media that was packaged with the original printed version of the book.
Table of Contents
Preface
Several Measures of Reliability
Multiple Regression
Logistic Regression and Discriminant Analysis
This book is intended to be a supplemental text in an intermediate statistics course in the behavioral sciences or education and it can be used in conjunction with any mainstream text. We have found that the book makes SPSS for Windows easy to use so that it is not necessary to have a formal, instructional computer lab; you should be able to learn how to use SPSS on your own with this book.
Although SPSS for Windows is quite easy to use, there is such a wide variety of options and statistics that knowing which ones to use and how to interpret the printouts can be difficult, so this book is intended to help with these challenges. In fact, as far as the procedures demonstrated, in this book there are only a few major differences between versions 7 and We also expect future Windows versions to be similar.
You should not have much difficulty if you have access to SPSS versions 7 through 9. Our students have used this book, or earlier editions of it, with all of these versions of SPSS; both the procedures and outputs are quite similar. Goals of This Book This book demonstrates how to produce a variety of statistics that are usually included in intermediate statistics courses, plus some e. Helping you learn how to choose the appropriate statistics, interpret the outputs, and develop skills in writing about the meaning of the results are the main goals of this book.
Thus, we have included material on: 1) How the appropriate choice of a statistic is based on the design of the research. This information will help you develop skills that cover a range of steps in the research process: design, data collection, data entry, data analysis, interpretation of outputs, and writing results. The modified high school and beyond data set (HSB) used in this book is similar to one you might have for a thesis, dissertation, or research project.
Therefore, we think it can serve as a model for your analysis. The compact disk (CD) packaged with the book contains the HSB data file and several other data sets used for the extra problems at the end of each chapter.
However, you will need to have access to or purchase the SPSS program. Partially to make the text more readable, we have chosen not to cite many references in the text; however, we have provided a short bibliography of some of the books and articles that we have found useful. We assume that most students will use this book in conjunction with a class that has a textbook; it will help you to read more about each statistic before doing the assignments.
Our "For Further Reading" list should also help.
Special Features
Several user friendly features of this book include:
1. The key SPSS windows that you see when performing the statistical analyses. This has been helpful to "visual learners."
2. The complete outputs for the analyses that we have done so you can see what you will get, after some editing in SPSS to make the outputs fit better on the pages.
3. Callout boxes on the outputs that point out parts of the output to focus on and indicate what they mean.
4. For each output, a boxed interpretation section that will help you understand the output.
5. Specially developed flow charts and tables to help you select an appropriate inferential statistic and tell you how to interpret statistical significance and effect sizes in Chapter 3. This chapter also provides an extended example of how to identify and write a research problem, several research questions, and a results paragraph for a t test and correlation.
6. Interpretation questions that stimulate you to think about the information in the chapter and outputs.
7. Answers to the odd numbered interpretation questions (Appendix D).
8. Several data sets on a CD. These realistic data sets are packaged with the book to provide you with data to be used to solve the chapter problems and the extra problems at the end of each chapter.
Overview of the Chapters
Our approach in this book is to present how to use and interpret SPSS in the context of proceeding as if the HSB data were the actual data from your research project.
However, before starting the SPSS assignments, we have three introductory chapters. The first chapter is an introduction and review of research design and how it would apply to analyzing the HSB data. In addition, this chapter includes a review of measurement and descriptive statistics.
Chapter 2 discusses rules for coding data, exploratory data analysis (EDA), and assumptions. Much of what is done in this chapter involves preliminary analyses to get ready to answer the research questions that you might state in a report. Chapter 3 provides a brief overview of research designs (between groups and within subjects). This chapter provides flowcharts and tables useful for selecting an appropriate statistic. Also included is an overview of how to interpret and write about the results of a basic inferential statistic.
This section includes not only testing for statistical significance but also a discussion of effect size measures and guidelines for interpreting them. Solving the problems in these chapters should give you a good idea of some of the intermediate statistics that can be computed with SPSS. In addition, it is our hope that interpreting what you get back from the computer will become more clear after doing these assignments, studying the outputs, answering the interpretation questions, and doing the extra SPSS problems.
Our Approach to Research Questions, Measurement, and Selection of Statistics In Chapters 1 and 3, our approach is somewhat nontraditional because we have found that students have a great deal of difficulty with some aspects of research and statistics but not others.
Most can learn formulas and "crunch" the numbers quite easily and accurately with a calculator or with a computer. However, many have trouble knowing what statistics to use and how to interpret the results. They do not seem to have a "big picture" or see how research design and measurement influence data analysis.
Part of the problem is inconsistent terminology. For these reasons, we have tried to present a semantically consistent and coherent picture of how research design leads to three basic kinds of research questions (difference, associational, and descriptive) which, in turn, lead to three kinds or groups of statistics with the same names. We realize that these and other attempts to develop and utilize a consistent framework are both nontraditional and somewhat of an oversimplification.
However, we think the framework and consistency pay off in terms of student understanding and ability to actually use statistics to answer their research questions. Instructors who are not persuaded that this framework is useful can skip Chapters 1 and 3 and still have a book that helps their students use and interpret SPSS.
Major Changes and Additions to This Edition
The following changes and additions are based on our experiences using the book with students, feedback from reviewers and other users, and the revisions in policy and best practice specified by the APA Task Force on Statistical Inference and the 5th Edition of the APA Publication Manual.
Effect size. We discuss effect size in addition to statistical significance in the interpretation sections to be consistent with the requirements of the revised APA manual. Because SPSS does not provide effect sizes for all the demonstrated statistics, we often show how to estimate or compute them by hand.
Writing about outputs. We have found the step from interpretation to writing quite difficult for students so we now put more emphasis on writing.
When each statistic is introduced, we have a brief section about its assumptions and when it is appropriate to select that statistic for the problem or question at hand.
Testing assumptions. We have expanded emphasis on exploratory data analysis (EDA) and how to test assumptions.
We have condensed several of the appendixes of the first edition into the alphabetically organized Appendix A, which is somewhat like a glossary.
Extra SPSS problems. We have developed additional extra problems, to give you more practice in running and interpreting SPSS.
Reliability assessment. We include a chapter on ways of assessing reliability including Cronbach's alpha, Cohen's kappa, and correlation. More emphasis on reliability and testing assumptions is consistent with our strategy of presenting SPSS procedures that students would use in an actual research project.
We have added a section on exploratory factor analysis to increase students' choices when using these types of analyses.
Interpretation questions. We have added more interpretation questions to each chapter because we have found them useful for student understanding.
For example: Highlight gender and math achievement. Click on the arrow to move the variables into the right hand box. Click on Options to get Fig 2. Click on Continue. Note that the words in italics are variable names and words in bold are words that you will see in the SPSS Windows and utilize to produce the desired output. In the text they are spelled and capitalized as you see them in the Windows. Bold is also used to identify key terms when they are introduced, defined, or important to understanding.
The words you will see in the pull down menus are given in bold with arrows between them. Occasionally, we have used underlines to emphasize critical points or commands. In fact, some sections of chapters 1 and 3 have been only slightly modified from that text. For this we thank Jeff Gliner, the first author of that book. Although Orlando Griego is not an author on this revision of our SPSS book, it still shows the imprint of his student friendly writing style.
We would like to acknowledge the assistance of the many students in our education and human development classes who have used earlier versions of this book and provided helpful suggestions for improvement. We could not have completed the task or made it look so good without our technology consultant, Don Quick, our word processors, Linda White and Catherine Lamana, and several capable work study students including Rae Russell, Katie Jones, Erica Snyder, and Jennifer Musser.
Joan Clay and Don Quick wrote helpful appendices for this edition. We also acknowledge the financial assistance of two instructional improvement grants from the College of Applied Human Sciences at Colorado State University. Finally, the patience of our families enabled us to complete the task, without too much family strain.
First, we provide a brief review of some key terms, as we will use them in this book. Variables Variables are key elements in research. A variable is defined as a characteristic of the participants or situation for a given study that has different values in that study. A variable must be able to vary or have different values or levels. Age is a variable that has a large number of values. Number of days to learn something or to recover from an ailment are common measures of the effect of a treatment and, thus, are also variables.
Similarly, amount of mathematics knowledge is a variable because it can vary from none to a lot. If a concept has only one value in a particular study, it is not a variable; it is a constant.
Thus, ethnic group is not a variable if all participants are European American. Gender is not a variable if all participants in a study are female.
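The distinction between a variable and a constant can be checked mechanically: a characteristic with only one distinct value in the sample is a constant for that study. The sketch below uses made-up participant records and a hypothetical helper name; it is an illustration of the definition, not an SPSS procedure.

```python
def classify_columns(records):
    """Label each field 'variable' or 'constant' by how many distinct values it takes."""
    fields = records[0].keys()
    return {
        field: "variable" if len({record[field] for record in records}) > 1 else "constant"
        for field in fields
    }


# Made-up sample in which every participant is female.
participants = [
    {"age": 17, "gender": "female", "math_score": 12.5},
    {"age": 18, "gender": "female", "math_score": 20.0},
    {"age": 17, "gender": "female", "math_score": 9.0},
]
labels = classify_columns(participants)
```

In this sample gender is a constant, just as in the all-female study mentioned above, while age and math score vary.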
In quantitative research, variables are defined operationally and are commonly divided into independent variables (active or attribute), dependent variables, and extraneous variables.
Each of these topics will be dealt with briefly in the following sections. Operational definitions of variables. An operational definition describes or defines a variable in terms of the operations or techniques used to make it happen or measure it. When quantitative researchers describe the variables in their study, they specify what they mean by demonstrating how they measured the variable.
Demographic variables like age, gender, or ethnic group are usually measured simply by asking the participant to choose the appropriate category from a list.
Types of treatment or curriculum are usually operationally defined much more extensively by describing what was done during the treatment or new curriculum. To do this, the investigator may provide sample questions, append the actual instrument, or provide a reference where more information can be found. Independent Variables In this book, we will refer to two types of independent variables: active and attribute. It is important to distinguish between these types when we discuss the results of a study.
Sometimes italics are also used to emphasize a word. We have put in bold the terms used in the SPSS windows and outputs e. Underlines are used to emphasize critical points. Bullets precede instructions about SPSS actions e.
An active independent variable is a variable, such as a workshop, new curriculum, or other intervention, one level of which is given to a group of participants, within a specified period of time during the study. For example, a researcher might investigate a new kind of therapy compared to the traditional treatment.
A second example might be to study the effect of a new teaching method, such as cooperative learning, on student performance. In these two examples, the variable of interest was something that was given to the participants. Thus, active independent variables are given to the participants in the study but are not necessarily given or manipulated by the experimenter. They may be given by a clinic, school, or someone other than the investigator, but from the participants' point of view, the situation was manipulated.
Using this definition, the treatment is usually given after the study was planned so that there could have been or preferably was a pretest. Other writers have similar but, perhaps, slightly different definitions of active independent variables. An active independent variable is a necessary but not sufficient condition to make cause and effect conclusions; the clearest causal conclusions can be drawn when participants are assigned randomly to conditions that are manipulated by the experimenter.
Attribute or measured independent variables. A variable that cannot be manipulated, yet is a major focus of the study, can be called an attribute independent variable. In other words, the values of the independent variable are preexisting attributes of the persons or their ongoing environment that are not systematically changed during the study.
Studies with only attribute independent variables are called nonexperimental studies. In keeping with SPSS, but unlike authors of some research methods books, we do not restrict the term independent variable to those variables that are manipulated or active. We define an independent variable more broadly to include any predictors, antecedents, or presumed causes or influences under investigation in the study. Attributes of the participants, as well as active independent variables, fit within this definition.
For the social sciences and education, attribute independent variables are especially important. Type of disability or level of disability may be the major focus of a study. Disability certainly qualifies as a variable since it can take on different values even though they are not given during the study. For example, cerebral palsy is different from Down syndrome, which is different from spina bifida, yet all are disabilities.
Also, there are different levels of the same disability. People already have defining characteristics or attributes that place them into one of two or more categories. The different disabilities are already present when we begin our study. Thus, we might also be interested in studying a class of variables that are not given or manipulated during the study, even by other persons, schools, or clinics. Other labels for the independent variable.
SPSS uses a variety of terms in addition to independent variable; for example, factor (chapters 8, 9, and 10) and covariates (chapter 7). In other cases (chapters 4 and 5), SPSS and statisticians do not make a distinction between the independent and dependent variable; they just label them variables. Values of the independent variable.
SPSS uses the term values to describe the several options or values of a variable. These values are not necessarily ordered, and several other terms, categories, levels, groups, or samples are sometimes used interchangeably with the term values, especially in statistics books.
Suppose that an investigator is performing a study to investigate the effect of a treatment. One group of participants receives the treatment. A second group does not receive the treatment. The study could be conceptualized as having one independent variable (treatment type), with two values or levels (treatment and no treatment).
The independent variable in this example would be classified as an active independent variable. Now suppose that the investigator compares two different treatments with a control condition. The study still would be conceptualized as having one active independent variable (treatment type), but with three values or levels (the two treatment conditions and the control condition).
As an additional example, consider gender, which is an attribute independent variable with two values, male and female. Note that in SPSS each variable is given a variable label; moreover, the values, which are often categories, have value labels e. Each value or level is assigned a number used by SPSS to compute statistics. It is especially important to know the value labels when the variable is nominal i.
Dependent Variables
The dependent variable is assumed to measure or assess the effect of the independent variable.
It is thought of as the presumed outcome or criterion. Dependent variables are often test scores, ratings on questionnaires, readings from instruments (electrocardiogram, galvanic skin response, etc.).
When we discuss measurement, we are usually referring to the dependent variable. Dependent variables, like independent variables, must have at least two values; most dependent variables have many values, varying from low to high.
SPSS also uses a number of other terms in addition to dependent variable. Dependent list is used in cases where you can do the same statistic several times, for a list of dependent variables e. Grouping variable is used in chapter 7 for discriminant analysis.
Extraneous Variables
These are variables (also called nuisance variables or, in some designs, covariates) that are not of primary interest in a particular study but could influence the dependent variable.
Environmental factors e. SPSS does not use the term extraneous variable. However, sometimes such variables are controlled using statistics that are available in SPSS. Research Hypotheses and Research Questions Research hypotheses are predictive statements about the relationship between variables. Research questions are similar to hypotheses, except that they do not entail specific predictions and are phrased in question format.
For example, one might have the following research question: "Is there a difference in students' scores on a standardized test if they took two tests in one day versus taking only one test on each of two days?" The figure also shows the general and specific purposes and the general types of statistics for each of these three types of research question. Difference research questions. This type of question attempts to demonstrate that groups are not the same on the dependent variable.
Associational research questions are those in which two or more variables are associated or related. This approach usually involves an attempt to see how two or more variables covary (as one grows larger, the other grows larger or smaller) or how one or more variables enable one to predict another variable. Descriptive research questions are not answered with inferential statistics.
They merely describe or summarize data, without trying to generalize to a larger population of individuals. Figure 1. Schematic diagram showing how the purpose and type of research question correspond to the general type of statistic used in a study. Also we wanted to distinguish between correlation, as a specific statistical technique, and the broader type of associational question and that group of statistics.
We think it is educationally useful to divide inferential statistics into two types, corresponding to difference and associational hypotheses or questions. Associational inferential statistics test for associations or relationships between variables and use, for example, correlation or multiple regression analysis.
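The contrast between the two types can be made concrete with one statistic from each group: a t statistic for a difference question and a Pearson correlation for an associational question. The sketch below is a plain-Python illustration with made-up data, not SPSS output; it assumes the pooled-variance form of the independent-samples t and does not compute significance levels.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's t for two independent groups, using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Difference question: do two groups differ on the dependent variable?
t_value = independent_t([1, 2, 3], [4, 5, 6])
# Associational question: how do two variables covary?
r_value = pearson_r([1, 2, 3], [2, 4, 6])
```

The first function compares group means on one dependent variable; the second relates two variables that both take many ordered values.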
We will utilize this contrast between difference and associational inferential statistics in chapter 3 and later in this book. Remember that research questions are similar to hypotheses, but they are stated in question format. We think it is advisable to use the question format when one does not have a clear directional prediction and for the descriptive approach. As implied by Fig.
Complex Research Questions Most research questions posed in this book involve more than two variables at a time. We call such questions and the appropriate statistics complex. Some of these statistics are called multivariate in other texts, but there is not a consistent definition of multivariate in the literature. We provide examples of how to write complex research questions in the chapter pertaining to each complex statistic.
We will see, in chapter 8, that although you do one factorial ANOVA, there are actually three or more research questions. This set of three questions can be considered a complex difference question because the study has two independent variables. Likewise, complex associational questions are used in studies with more than one independent variable considered together. Table 1. The table also includes references to other chapters in this book and examples of the types of statistics that we include under each of the six types of questions.
The HSB data set is based on a national sample of data from more than 28, high school students. The current data set is a sample of 75 students drawn randomly from the larger population.
The data that we have for this sample includes school outcomes such as grades and the number of mathematics courses of different types that the students took in high school. Also, there are several kinds of standardized test data and demographic data such as gender and mother's and father's education.
To provide an example
3 We realize that all parametric inferential statistics are relational, so this dichotomy of using one type of data analysis procedure to test for differences (when there are a few values or levels of the independent variables) and another type of data analysis procedure to test for associations (when there are continuous independent variables) is somewhat artificial. Both continuous and categorical independent variables can be used in a general linear model approach to data analysis. However, we think that the distinction is useful because most researchers utilize the above dichotomy in selecting statistics for data analysis.
These data were developed for this book and, thus, are not really the math attitudes of the 75 students in this sample; however, they are based on real data gathered by one of the authors to study motivation.
These inclusions enable us to do some additional statistical analyses.
[Table residue; surviving cell text: usually two or a few independent; usually at least five ordered levels for both variables, often continuous; often five or more ordered levels for all variables, but some or all can be dichotomous variables.]
Note: Many studies have more than one dependent variable. It is common to treat each one separately i. However, there are complex statistics e.
The Research Problem
Imagine that you are interested in the general problem of what factors seem to influence mathematics achievement at the end of high school.
You might have some hunches or hypotheses about such factors based on your experience and your reading of the research and popular literature.
Some factors that might influence mathematics achievement are commonly called demographics: for example, gender, ethnic group, and mother's and father's education. A probable influence would be the mathematics courses that the student has taken. Such variables could influence what courses one took, the grades one received, and might be correlates of the demographic variables. We might wonder how spatial performance scores, such as pattern or mosaic pattern test scores and visualization scores might enable a more complete understanding of the problem, and whether these skills seem to be influenced by the same factors as math achievement.
The HSB Variables

Before we state the research problem and questions in more formal ways, we need to step back and discuss the types of variables and the approaches that might be used to study the above problem.

The primary dependent variable. Given the above research problem, which focuses on mathematics achievement at the end of the senior year, the primary dependent variable is math achievement.

Independent and extraneous variables. The number of math courses taken up to that point is best considered to be an antecedent or independent variable in this study.
What about father's and mother's education and gender? How would you classify gender and parents' education in terms of the type of variable? What about grades? Like the number of math courses, these variables would usually be considered independent variables because they occurred before the math achievement test.
However, some of these variables, specifically parental education, might be viewed as extraneous variables that need to be "controlled." Note that student's class is a constant and is not a variable in this study because all the participants are high school seniors i.

Types of independent variables. As we discussed previously, independent variables can be active (given to the participant during the study or manipulated by the investigator) or attributes of the participants or their environments.
Are there any active independent variables in this study? There is no intervention, new curriculum, or similar treatment. All the independent variables, then, are attribute variables because they are attributes or characteristics of these high school students.
Given that all the independent variables are attributes, the research approach cannot be experimental. This means that we will not be able to draw definite conclusions about cause and effect i. Now we will examine the hsbdataB. We have provided a CD that contains the data for each of the 75 participants on 45 variables.
The variables in the hsbdataB. The CD in this book contains several SPSS data files for you to use, but it does not include the actual SPSS program, which you will have to have access to in order to do the assignments. When you open this file and click 4 We have decided to use the short version of mathematics i.
What is included in the variable view screen is described in more detail in Appendix B, Getting Started. Here, focus on the Name, Label, Values, and Missing columns. Name is a short name for each variable e. Label is a longer label for the variable e. The Values column contains the value labels, but you can see only the label for one value at a time e.
None just means that there are no special missing values, just the usual SPSS system missing value, which is a blank.

[Figure: Part of the hsbdataB.]

The variables of ethnic and religion were added to the original HSB data set to provide true nominal (unordered) variables with a few (4 and 3) levels or values.
In addition, for ethnic and religion, we have made two missing value codes to illustrate this possibility. All other variables use blanks, the SPSS system missing value, for missing data.
For ethnicity, 98 indicates multiethnic and other. For religion, all the high school students who were not Protestant or Catholic, or who said they had no religion, were coded 98 and considered to be missing because none of the other religions had enough members to make a reasonably sized group.
Those who left the ethnicity or religion questions blank were coded as 99, also missing. In SPSS 12, it can be longer, but we recommend that you keep it short. SPSS names must start with a letter and must not contain blank spaces or certain special characters e.
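The missing-value codes described above (98 and 99 for ethnic and religion) have to be excluded before any counting or averaging if the data are handled outside SPSS. The following is only a minimal sketch in plain Python; the function name `recode_missing` and the sample list are hypothetical and are not taken from the HSB data file.

```python
# Codes that the text says are treated as missing for ethnic and religion.
MISSING_CODES = {98, 99}


def recode_missing(values, missing_codes=MISSING_CODES):
    """Return a copy of values with every missing code replaced by None."""
    return [None if v in missing_codes else v for v in values]


# Hypothetical religion codes for six participants (illustration only).
religion = [1, 2, 98, 3, 99, 1]

cleaned = recode_missing(religion)

# Keep only the valid responses for later tallies.
valid = [v for v in cleaned if v is not None]

print(cleaned)
print(len(valid), "valid responses out of", len(religion))
```

Using None for both codes mirrors the way the text treats 98 and 99 as equally missing; keeping the two codes distinct would need a different representation.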
This is a test of pattern recognition ability involving the detection of relationships in patterns of tiles. Notice the short variable names e. Be aware that the participants are listed down the left side of the page, and the variables are always listed across the top. You will always enter data this way.
If a variable is measured more than once, such as visual and visual2 see Fig 1. Note that in Fig. Notice also that some cells, like father's education for participant 5, are blank because a datum is missing. Perhaps participant 5 did not know her father's education. In this section, we list some of the possible questions you might ask in order to give you an idea of the range of types of questions that one might have in a typical research project like a thesis or dissertation.
For review, we start with basic descriptive questions and some questions about assumptions. These statistics are not discussed in this book but how to compute them can be found in the Quick Reference Guide Appendix A. Finally, we pose a number of complex questions that can be answered with the statistics discussed in this book.
Thus, we could answer, with the outputs in chapter 2, the following basic descriptive questions: "What is the average educational level of the fathers of the students in this sample?" One question is "Are the frequency distributions of the math achievement scores markedly skewed; that is, different from the normal curve distribution?"
In chapter 4, correlation is also used to assess reliability. The example discussed here and throughout the book is based on 13 variables obtained from a random sample of 75 out of 28, high school seniors.
These variables include achievement scores, grades, and demographics. The raw data for the 13 variables were slightly modified from data in an appendix in Hinkle, Wiersma, and Jurs That file had no missing data, which is unusual in behavioral science research, so we made some. This complex associational question can be answered with multiple regression, as discussed in chapter 6. If the dependent variable, math achievement, were dichotomous high vs.
This introduction to the research problem and questions raised by the HSB data set should help make the assignments meaningful, and it should provide a guide and examples for your own research.

Frequency Distributions

Frequency distributions are critical to understanding our use of measurement terms.
We begin this section with a discussion of frequency distributions and two examples. Frequency tables and distributions can be used whether the variable involved has ordered or unordered levels (SPSS calls them values). In this section, we will only consider variables with many ordered values, ranging from low to high.
A frequency distribution is a tally or count of the number of times each score on a single variable occurs. For example, the frequency distribution of final grades in a class of 50 students might be 7 As, 20 Bs, 18 Cs, and 5 Ds.
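The grade example above can be tallied in a few lines. This is a small sketch using Python's standard library; the list `grades` is built here only to reproduce the counts given in the text (7 As, 20 Bs, 18 Cs, and 5 Ds) and does not come from a real class roster.

```python
from collections import Counter

# Rebuild the 50 final grades from the counts given in the text.
grades = ["A"] * 7 + ["B"] * 20 + ["C"] * 18 + ["D"] * 5

# A frequency distribution is a count of how often each value occurs.
freq = Counter(grades)

# Print the tally in grade order, with a simple text bar for each grade.
for grade in sorted(freq):
    print(f"{grade}: {freq[grade]:2d} {'#' * freq[grade]}")
```

The text bars give a rough picture of the shape: the middle grades have the longest bars and the two extremes the shortest.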
Note that in this frequency distribution most students have Bs or Cs (grades in the middle) and similar smaller numbers have As and Ds (high and low grades). When there are small numbers of scores for the low and high values and most scores are for the middle values, the distribution is said to be approximately normally distributed. We will discuss the normal curve in more detail later in this chapter. When the variable is continuous or has many ordered levels (or values), the frequency distribution usually is based on ranges of values for the variable.
For example, the frequencies (number of students), shown by the bars in Fig 1. Similar small numbers of students have very low and very high scores. Thus, the frequency distribution of the SAT math scores is said to be approximately normal.

[Figure: A grouped frequency distribution for SAT math scores.]

Notice that the bars form a pattern very different from the normal curve line.
This distribution can be said to be not normally distributed. As we will see later in the chapter, the distribution is negatively skewed. That is, the tail of the curve or the extreme scores are on the low end or left side. Note how much this differs from the SAT math score frequency distribution.
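Negative skew can also be checked numerically. The sketch below computes the simple moment-based skewness coefficient (the mean of the cubed standardized scores); statistical packages often apply a small-sample adjustment, so their values may differ slightly. The `scores` list is invented for illustration and is not the competence scale data.

```python
from statistics import mean, pstdev


def skewness(scores):
    """Moment-based skewness: the mean of the cubed z-scores.

    Negative values indicate a longer tail on the low (left) side,
    positive values a longer tail on the high (right) side.
    """
    m = mean(scores)
    s = pstdev(scores)  # population standard deviation
    return sum(((x - m) / s) ** 3 for x in scores) / len(scores)


# Hypothetical scores: most are high, with a few low values in the tail.
scores = [1.0, 2.0, 2.5, 3.0, 3.0, 3.5, 3.5, 3.5, 4.0, 4.0, 4.0, 4.0]

# Mirror the scores around 5 to get a tail on the high side instead.
mirrored = [5.0 - x for x in scores]

print("skewness of scores:  ", round(skewness(scores), 2))
print("skewness of mirrored:", round(skewness(mirrored), 2))
```

Mirroring the scores flips the sign of the coefficient, which matches the description in the text: the sign depends on which side the tail of extreme scores falls.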
As you will see in the Measurement section (below), we call the competence scale variable ordinal. You can create these figures yourself using the hsbdataB.

[Figure: A grouped frequency distribution for the competence scale.]

Levels of Measurement

Measurement is the assignment of numbers or symbols to the different characteristics (values) of variables according to rules.
In order to understand your variables, it is important to know their level of measurement. Depending on the level of measurement of a variable, the data can mean different things. To help understand these differences, types or levels of variables have been identified. It is common and traditional to discuss four levels or scales of measurement: nominal, ordinal, interval, and ratio, which vary from the unordered nominal to the highest level ratio.
These four traditional terms are not the same as those used in SPSS, and we think that they are not always the most useful for determining what statistics to use.
SPSS uses three terms (nominal, ordinal, and scale) for the levels or types of measurement. How these correspond to the traditional terms is shown in Table 1. When you name and label variables in SPSS, you have the opportunity to select one of these three types of measurement (see Fig 1.). Although what you choose does not affect what SPSS does in most cases, an appropriate choice both indicates that you understand your data and may help guide your selection of statistics.
We believe that the terms nominal, dichotomous, ordinal, and approximately normal (for normally distributed) are usually more useful than the traditional or SPSS measurement terms for the selection and interpretation of statistics. In part, this is because statisticians disagree about the usefulness of the traditional levels of measurement in determining the appropriate selection of statistics. Furthermore, our experience is that the traditional terms are frequently misunderstood and applied inappropriately by students.