
Central Limit Theorem


Academic year: 2022


(1)

UNIT – 4

SAMPLING, HYPOTHESIS TESTING AND DATA PREPARATION

Dr. Mohd. Sarwar Alam Asst. Professor

Dept. of Business Administration

(2)

SYLLABUS

Sampling Theory, Designs and Issues

Central Limit Theorem

Hypothesis Testing-Concept & Procedures

Data Preparation Process

(3)

SAMPLING

The process of choosing a sample out of a population on the basis of some criteria is known as sampling.

The criteria depend on the information required by the researcher.

What is a sample?

Answer: It is a part of the population selected in such a way that:

1. It can represent the characteristics of the entire population.

(4)

SAMPLING DESIGN PROCESS

1. Define the Population

2. Determine the Sampling Frame

3. Select Sampling Technique(s)

4. Determine the Sample Size

5. Execute the Sampling Process

(5)

CONTINUED

Define the Population/Target Population

The target population is the collection of elements or objects that possess the information sought by the researcher and about which inferences are to be made.

The target population should be defined in terms of elements, sampling units, extent, and time.

An element is the object about which or from which the information is desired, e.g., the respondent.

A sampling unit is an element, or a unit containing the element, that is available for selection at some stage of the sampling process.

(6)

CONTINUED

Determine the Sampling Frame

The sampling frame is the list or map that identifies every sampling unit. It is an exhaustive list of sampling units.

The list may be either in tabular form or in the form of maps.

(7)

CONTINUED

Determine the Sample Size

Important qualitative factors in determining the sample size are:

I. The nature of the research

II. The number of variables

III. The nature of the analysis

IV. Sample sizes used in similar studies

(8)

CONTINUED

Sample Sizes Used in Marketing Research Studies

Type of Study                                                    Minimum Size   Typical Range
Problem identification research (e.g. market potential)          500            1,000-2,500
Problem-solving research (e.g. pricing)                          200            300-500
Product tests                                                    200            300-500
Test marketing studies                                           200            300-500
TV, radio, or print advertising (per commercial or ad tested)    150            200-300
Test-market audits                                               10 stores      10-20 stores
Focus groups                                                     2 groups       6-15 groups

(9)

CLASSIFICATION OF SAMPLING TECHNIQUES

Sampling Techniques

Non-Probability Sampling Techniques

Convenience Sampling

Judgmental Sampling

Quota Sampling

Snowball Sampling

Probability Sampling Techniques

(10)

CONVENIENCE SAMPLING

Convenience sampling attempts to obtain a sample of convenient elements.

Often, respondents are selected because they happen to be in the right place at the right time.

Use of students, and members of social organizations.

Mall intercept interviews without qualifying the respondents

“People on the street” interviews

(11)

CONTINUED

A B C D E

1 6 11 16 21

2 7 12 17 22

3 8 13 18 23

4 9 14 19 24

5 10 15 20 25

Group D happens to assemble at a convenient time and place, so all the elements in this group are selected. The resulting sample consists of elements 16, 17, 18, 19, and 20. Note that no elements are selected from groups A, B, C, and E.

(12)

JUDGMENTAL SAMPLING

Judgmental sampling is a form of convenience sampling in which the population elements are selected based on the judgment of the researcher.

Test markets

Purchase engineers selected in industrial marketing research

Expert witnesses used in court

(13)

CONTINUED

A B C D E

1 6 11 16 21

2 7 12 17 22

3 8 13 18 23

4 9 14 19 24

5 10 15 20 25

The researcher considers groups B, C and E to be typical and convenient. Within each of these groups one or two elements are selected based on typicality and convenience. The resulting sample consists of elements 8, 10, 11, 13, and 24.

Note, no elements are selected from groups A and D.

(14)

QUOTA SAMPLING

Quota sampling may be viewed as two-stage restricted judgmental sampling.

I. The first stage consists of developing control categories, or quotas, of population elements.

II. In the second stage, sample elements are selected based on convenience or judgment.

Control Variable    Population Composition    Sample Composition
Gender              Percentage                Percentage    Number
1. Male             48                        48            480
2. Female           52                        52            520
Total               100                       100           1000

(15)

CONTINUED

A B C D E

1 6 11 16 21

2 7 12 17 22

3 8 13 18 23

4 9 14 19 24

5 10 15 20 25

A quota of one element from each group, A to E, is imposed. Within each group, one element is selected based on judgment or convenience.

The resulting sample consists of elements 3, 6, 13, 20 and 22. Note, one element is selected from each column or group.

(16)

SNOWBALL SAMPLING

1. In snowball sampling, an initial group of respondents is selected, usually at random.

2. After being interviewed, these respondents are asked to identify others who belong to the target population of interest.

3. Subsequent respondents are selected based on the referrals.

(17)

CONTINUED

Random Selection and Referrals

A B C D E

1 6 11 16 21

2 7 12 17 22

3 8 13 18 23

4 9 14 19 24

5 10 15 20 25

Elements 2 and 9 are selected randomly from groups A and B. Element 2 refers elements 12 and 13. Element 9 refers element 18. The resulting sample consists of elements 2, 9, 12, 13, and 18. Note that there are no elements from group E.

(18)

SIMPLE RANDOM SAMPLING

Each element in the population has a known and equal probability of selection.

Each possible sample of a given size (n) has a known and equal probability of being the sample actually selected.

This implies that every element is selected independently of every other element.

It can be applied when the population is more or less homogeneous.

(19)

CONTINUED

Using a table of random numbers to select a sample (simple random sampling)

A researcher wants to randomly select 10 students from a class. How can this be done using a random number table?

Step 1: Assign a unique number to each member of the class (population), pairing each name with a number.

Step 2: Select any starting point in the random number table and find the first number that corresponds to the starting point.

Step 3: Move to the next number in the same row and include the corresponding person in the sample.
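The random number table procedure above can also be mimicked in software. The following is a minimal sketch, assuming a hypothetical class of 40 students numbered 1 to 40 (the class size is not given in the slides). Python's `random.sample` plays the role of the table and gives every student an equal chance of selection.

```python
import random

# Hypothetical population: a class of 40 students numbered 1..40 (assumed size).
population = list(range(1, 41))

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units, each with an equal probability of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(units, n)

sample = simple_random_sample(population, 10, seed=1)
print(sorted(sample))
```

Sampling is done without replacement, so no student can appear in the sample twice.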

(20)

SYSTEMATIC SAMPLING

The sample is chosen by selecting a random starting point and then picking every ‘ith’ element in succession from the sampling frame.

The sampling interval, ‘i’, is determined by dividing the population size N by the sample size n and rounding to the nearest integer.

For example, there are 10,000 elements in the population and a sample of 1000 is desired. In this case the sampling interval, i, is 10. A random number between 1 and 10 is selected. If, for example, this number is 3, the sample consists of elements 3, 13, 23, 33, 43, 53, and so on.
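The example above can be reproduced with a short sketch. The function name `systematic_sample` and its parameters are illustrative assumptions, and the sampling frame is assumed to be numbered 1 to N.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Select every i-th element from a sampling frame numbered 1..N."""
    i = round(N / n)  # sampling interval
    if start is None:
        # Random starting point between 1 and i.
        start = random.Random(seed).randint(1, i)
    return list(range(start, N + 1, i))

sample = systematic_sample(10_000, 1_000, start=3)
print(sample[:6])   # → [3, 13, 23, 33, 43, 53]
print(len(sample))  # → 1000
```

Passing `start=3` fixes the starting point to match the slide's example; leaving it out draws the starting point at random.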

(21)

STRATIFIED SAMPLING

A two-step process in which the population is partitioned into subpopulations, or strata.

The strata should be mutually exclusive and collectively exhaustive in that every population element should be assigned to one and only one stratum and no population elements should be omitted.

The elements within a stratum should be as homogeneous as possible, but the elements in different strata should be as heterogeneous as possible.

(22)

CONTINUED

A major objective of stratified sampling is to increase precision without increasing cost.

Stratified sampling can be of two types: (1) proportionate stratified sampling and (2) disproportionate stratified sampling.

In proportionate stratified sampling, the size of the sample drawn from each stratum is proportionate to the relative size of that stratum in the total population.

In disproportionate stratified sampling, the size of the sample from each stratum is not proportionate to the relative size of that stratum in the total population.
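Proportionate allocation is simple arithmetic: each stratum receives the share of the sample that matches its share of the population. The sketch below assumes three hypothetical strata and a total sample of 500; none of these figures come from the slides.

```python
def proportionate_allocation(strata_sizes, n):
    """Allocate a total sample of n across strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

# Hypothetical strata sizes (assumed for illustration).
strata = {"urban": 5000, "semi-urban": 3000, "rural": 2000}
allocation = proportionate_allocation(strata, 500)
print(allocation)  # → {'urban': 250, 'semi-urban': 150, 'rural': 100}
```

With less convenient numbers, rounding may leave the allocations one or two units away from the intended total, so the result should be checked and adjusted by hand. A disproportionate design would replace the proportional formula with sample sizes chosen by the researcher.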

(23)

CLUSTER SAMPLING

The target population is first divided into mutually exclusive and collectively exhaustive subpopulations, or clusters.

Then a random sample of clusters is selected, based on a probability sampling technique such as simple random sampling.

For each selected cluster, either all the elements are

(24)

CENTRAL LIMIT THEOREM

It is a statistical theory which states that, given a sufficiently large sample size from a population with finite variance, the distribution of the sample means will be approximately normal, and the mean of all sample means from the same population will be approximately equal to the mean of the population.

Illustration:

Let us assume a population of size N has mean = μ, standard deviation = σ, and variance = σ².

Choose a random sample of size n with elements x1, x2, x3, …, xn. Do this repeatedly, drawing many samples from the population, and calculate the mean (X̄) for each sample. This gives the mean values (X̄1, X̄2, …), which form another distribution known as the sampling distribution of the mean.

(25)

CONTINUED

The sampling distribution of the mean will approach a normal distribution with mean μ and variance σ²/n.
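A small simulation can make this result concrete. The sketch below assumes an exponential population with mean 1 and variance 1 (chosen only because it is clearly non-normal), a sample size of n = 50, and 5,000 repeated samples. All of these values are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

mu = 1.0       # mean of an exponential population with rate 1
sigma2 = 1.0   # variance of the same population
n = 50         # size of each sample
num_samples = 5000

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"variance of sample means: {var_of_means:.4f} (sigma^2 / n = {sigma2 / n:.4f})")
```

The mean of the sample means should land close to μ and their variance close to σ²/n, even though the population itself is skewed.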

Hence, according to Central Limit Theorem, the sampling distribution of the sample means has a

(26)

HYPOTHESIS TESTING

1. Formulate H0 and H1

2. Select Appropriate Test

3. Choose Level of Significance

4. Collect Data and Calculate Test Statistic

5. Determine Probability Associated with Test Statistic and Compare with Level of Significance, or Determine Critical Value of Test Statistic (TSCR) and Determine if TSCAL Falls into (Non) Rejection Region; then Reject or Do Not Reject H0

6. Draw Marketing Research Conclusion

(27)

STEP 1: FORMULATE THE HYPOTHESIS

A null hypothesis is a statement of the status quo, one of no difference or no effect. If the null hypothesis is not rejected, no changes will be made.

An alternative hypothesis is one in which some difference or effect is expected. Accepting the alternative hypothesis will lead to changes in opinions or actions.

The null hypothesis refers to a specified value of the

(28)

CONTINUED

In marketing research, the null hypothesis is formulated in such a way that its rejection leads to the acceptance of the desired conclusion. The alternative hypothesis represents the conclusion for which evidence is sought.

H0: μ ≤ 0.40

H1: μ > 0.40

(29)

STEP 2: SELECT AN APPROPRIATE TEST

The test statistic measures how close the sample has come to the null hypothesis.

The test statistic often follows a well-known distribution, such as the normal, t, or chi-square distribution.

When testing a hypothesis of proportion, we use the z-statistic or z-test, and when testing a hypothesis of mean, we use the z or t-statistic according to the

(30)

CONTINUED

1. If the population standard deviation (σ) is known and either the data are normally distributed or the sample size n > 30, we use the normal distribution, i.e. the z-test.

2. If the population standard deviation (σ) is unknown and either the data are normally distributed or the sample size n < 30, we use the t-test.

(31)

STEP 3: CHOOSE A LEVEL OF SIGNIFICANCE

Type I Error

Type I error occurs when the sample results lead to the rejection of the null hypothesis when it is in fact true.

The probability of type I error (α) is also called the level of significance.

Type II Error

Type II error occurs when, based on the sample results, the null hypothesis is not rejected when it is in fact false.

The probability of type II error is denoted by β.

Unlike α, which is specified by the researcher, the

(32)

STEP 4: COLLECT DATA AND CALCULATE TEST STATISTIC

The required data are collected and the value of the test statistic computed.

(33)

STEP 5: DETERMINE THE PROBABILITY (CRITICAL VALUE) & MAKE DECISION

Using standard normal or t tables, the probability of obtaining a particular z or t-value can be calculated.

If the probability associated with the calculated or observed value of the test statistic is less than the level of significance, the null hypothesis is rejected.

Alternatively, if the absolute calculated value of the test statistic is greater than the absolute critical value of the test statistic (|TSCR|), the null hypothesis is

(34)

CONTINUED

Note that the two ways of testing the null hypothesis are equivalent but mathematically opposite in the direction of comparison.

If the probability associated with TSCAL < significance level (α), then reject H0; equivalently, if |TSCAL| > |TSCR|, then reject H0.
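The equivalence of the two decision rules can be checked with a short sketch. It assumes a one-tailed z-test of a proportion against the value 0.40 from Step 1, a significance level of 0.05, and hypothetical survey data (17 "yes" answers out of 30). None of these data come from the slides.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical survey data (not from the slides): 17 of 30 respondents say "yes".
n, successes = 30, 17
p0 = 0.40       # value stated in H0
alpha = 0.05    # level of significance chosen in Step 3 (assumed)

p_hat = successes / n
std_error = sqrt(p0 * (1 - p0) / n)    # standard error under H0
ts_cal = (p_hat - p0) / std_error      # calculated test statistic (z)

std_normal = NormalDist()
p_value = 1 - std_normal.cdf(ts_cal)   # one-tailed, because H1 is "greater than"
ts_cr = std_normal.inv_cdf(1 - alpha)  # critical value of z

reject_by_probability = p_value < alpha
reject_by_critical_value = ts_cal > ts_cr

print(f"TSCAL = {ts_cal:.2f}, TSCR = {ts_cr:.3f}, p-value = {p_value:.4f}")
print(reject_by_probability, reject_by_critical_value)
```

Both rules give the same decision: a smaller probability corresponds to a larger test statistic, which is why the two comparisons run in opposite directions. Because this sketch is an upper-tailed test, it compares TSCAL with TSCR directly rather than using absolute values.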

(35)

DATA PREPARATION PROCESS

1. Check Questionnaire

2. Editing the Data

3. Coding of Data

4. Transcription of Data

5. Cleaning of Data

6. Statistically Adjust the Data

(36)

STAGE 1: CHECKING QUESTIONNAIRE

The Questionnaire should be fully completed.

The responses should not be the same for all questions.

No page should be missing.

The questionnaire is received on time.

The questionnaire is answered by only those who qualify for participation.

(37)

STAGE 2: EDITING OF DATA

Editing involves treating only those data that do not look normal.

Treatment can be done by:

I. Returning to the Field – Questionnaires with unsatisfactory responses may be returned to the field, where the interviewers re-contact the respondents.

II. Assigning Missing Values – If returning the questionnaires to the field is not feasible, the editor may assign missing values to unsatisfactory responses.

III. Discarding Unsatisfactory Respondents – In this

(38)

STAGE 3: CODING OF DATA

Coding of responses

Assigning IDs to respondents


(39)

CONTINUED

ID Age Gender Education Quality Price Value

1 1 1 1 4 4 4

2 2 1 2 5 3 4

3 4 1 3 3 3 2

4 3 2 2 2 4 2

5 1 2 1 4 5 3

(40)

STAGE 4: TRANSCRIPTION OF DATA

Transcription involves transferring the coded data onto computer memory, disks, or other storage.

(41)

STAGE 5: CLEANING OF DATA

This stage involves checking consistency in data.

It identifies data that are out of range, logically inconsistent, or have extreme values. Extreme values should be closely examined.

Further, such responses are treated using the following methods:

1. Substitute a neutral value

2. Substitute an imputed response
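A range check followed by neutral-value substitution can be sketched as below. The records, the 1-5 rating scale, and the choice of the scale midpoint (3) as the neutral value are all assumptions made for illustration.

```python
# Hypothetical coded responses; ratings are assumed to lie on a 1-5 scale.
records = [
    {"id": 1, "quality": 4, "price": 4, "value": 4},
    {"id": 2, "quality": 5, "price": 9, "value": 4},     # 9 is out of range
    {"id": 3, "quality": 3, "price": None, "value": 2},  # missing response
]
VALID_RANGE = range(1, 6)  # allowed codes 1..5
NEUTRAL = 3                # scale midpoint used as the neutral value

def clean(record, fields=("quality", "price", "value")):
    """Return a cleaned copy and the list of fields that were replaced."""
    cleaned, flagged = dict(record), []
    for field in fields:
        if cleaned[field] not in VALID_RANGE:
            cleaned[field] = NEUTRAL
            flagged.append(field)
    return cleaned, flagged

results = [clean(r) for r in records]
for cleaned, flagged in results:
    print(cleaned["id"], cleaned, flagged)
```

Keeping the list of flagged fields alongside each cleaned record lets the researcher examine the replaced values instead of silently overwriting them.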

(42)

STAGE 6: STATISTICALLY ADJUSTING THE DATA

This stage involves three types of data adjustment:

1. Data Weighting

Each case or respondent in the database is assigned a weight to reflect its importance relative to other cases or respondents.

2. Variable re-specification

Involves the transformation of data to create new variables or modify existing variables. For example, the researcher may create new variables that are composites of several other variables.

3. Scale transformation and standardization

Involves a manipulation of scale values to ensure comparability with other scales or otherwise make the data suitable for analysis.
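The three adjustments can be illustrated on the Quality and Price columns of the coded data shown in Stage 3. This is a minimal sketch: the case weights are hypothetical, and a simple average is assumed as the composite variable.

```python
import statistics

# Quality and Price codes for respondents 1-5 (from the coded data table).
quality = [4, 5, 3, 2, 4]
price = [4, 3, 3, 4, 5]
weights = [1.0, 1.0, 0.5, 1.5, 1.0]  # hypothetical case weights

# 1. Data weighting: weighted mean of the quality ratings.
weighted_mean = sum(w * q for w, q in zip(weights, quality)) / sum(weights)

# 2. Variable re-specification: a composite of two existing variables.
composite = [(q + p) / 2 for q, p in zip(quality, price)]

# 3. Standardization: convert ratings to z-scores (mean 0, standard deviation 1).
mean_q = statistics.fmean(quality)
sd_q = statistics.pstdev(quality)
z_quality = [(q - mean_q) / sd_q for q in quality]

print(round(weighted_mean, 2))
print(composite)
print([round(z, 2) for z in z_quality])
```

Standardized scores have no units, which is what allows variables measured on different scales to be compared.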

(43)

STAGE 7: SELECTING A DATA ANALYSIS STRATEGY

In this stage, the following factors should be taken into account:

The earlier steps of the marketing research process.

The characteristics of the data, such as whether they are quantitative or qualitative, metric or non-metric, numeric or non-numeric, and measured on a nominal, ordinal, interval, or ratio scale.

(44)

Thank You
