MATE 252 WEB PAGE

CHAPTER 7 Sample and Sampling Distributions

 

 

Definition of the Population Distribution:

 

This is the distribution from which we draw the sample. It is usually unknown, and we make inferences about its characteristics, that is, its parameters. These parameters are denoted μ and σ; they describe the center and the spread of the population. N denotes the population size.

 

Definition of the Sample Distribution:

 

This is the distribution of the sample observations X1, X2, X3, ..., Xn. The data can be displayed graphically as a histogram or summarized numerically by the sample mean x̄ and the sample standard deviation S. The larger the sample size n, the closer sample statistics such as x̄ and S tend to fall to population parameters such as μ and σ. In the limit n = N, the sample is the population itself.

 

Definition of the Sampling Distribution:

 

This is the probability distribution of a sample statistic (here, of the sample means). The sampling distribution describes how the statistic varies from sample to sample, and its shape depends on the sample size. It lets us determine the probability that the sample statistics (x̄, S) fall within a given neighborhood of the population parameters.

 

Example:

A polling problem with three distributions.

Let us look at a parliamentary election problem for Istanbul.

Let Y = 0 denote a vote for CHP and Y = 1 a vote for AKP, recorded for 4 million voters. Suppose half of the voters choose CHP and the other half AKP. Then μ = 0(0.5) + 1(0.5) = 0.5

and σ = √(0.5 × 0.5) = 0.5. The population distribution therefore has μ = 0.5 and σ = 0.5, and it is not bell-shaped.

 

Usually the population is unknown, and samples of size n are drawn from it. This time, let us draw a random sample of n = 100 voters from the population. The histogram of these 100 observations gives the sample distribution. Suppose 46 people say AKP and 54 say CHP. The statistics of the sample distribution are then Ȳ = 0.46 and S = 0.498. Had we taken the entire electorate of Istanbul as the sample, these two discrete distributions would have been identical. For n = 100 the sample distribution is discrete as well.

From the Central Limit Theorem (CLT) we know that the sampling distribution of Ȳ has mean 0.5 and standard error σ/√n = 0.5/√100 = 0.05, and that for n > 30 the sampling distribution is approximately normal. As n approaches N, this normal sampling distribution becomes narrower and narrower, and when n = N it collapses to a vertical line at Ȳ = μ. That is, when the whole population is sampled, the probability that Ȳ = μ is 1.
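The behavior of the sampling distribution in the Istanbul polling example can be checked with a small simulation. The sketch below is only illustrative: it assumes that voters answer independently with probability 0.5 (a reasonable stand-in for sampling 100 voters out of 4 million), and the function name, the number of repetitions and the seed are arbitrary choices, not part of the course material.

```python
import random
import statistics


def simulate_sample_means(p=0.5, n=100, repetitions=5000, seed=252):
    """Draw `repetitions` samples of size n from a 0/1 population; return their means."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    means = []
    for _ in range(repetitions):
        # Each voter is coded Y = 1 (AKP) with probability p, otherwise Y = 0 (CHP).
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        means.append(sum(sample) / n)
    return means


means = simulate_sample_means()
center = statistics.mean(means)    # expected to be close to mu = 0.5
spread = statistics.pstdev(means)  # expected to be close to sigma / sqrt(n) = 0.05

print(f"mean of the sample means:      {center:.3f}")
print(f"std. dev. of the sample means: {spread:.3f}")
```

A histogram of `means` would look roughly bell-shaped even though the population itself takes only the values 0 and 1.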

 

Exercises

 

1)      In Turkey, the average price per unit of a product is 1.20 YTL. We can take this as the expected value of the price per unit; let us also assume a standard deviation of 0.10 YTL. Fifty retail stores are selected at random from this population, and the average price per unit is estimated from these stores. (The average price of all products in each store is known.)

a)      Show the sampling distribution of the sample mean x̄, where x̄ is the sample mean price per unit for the 50 retail stores.

b)      What is the probability that the simple random sample will provide a sample mean within 2 Yeni Kuruş (0.02 YTL) of the population mean?

c)      What is the probability that the simple random sample will provide a sample mean within 1 Yeni Kuruş (0.01 YTL) of the population mean?

2)      Assume that 15% of the parts produced in an assembly line operation are defective, but that the firm’s production manager is not aware of this situation. Assume further that 50 parts are tested by the quality assurance department to determine the quality of the assembly operation. Let p̄ be the sample proportion found defective by the quality assurance test.

a)      Show the sampling distribution for p̄.

b)      What is the probability that the sample proportion will be within 0.03 of the population proportion that is defective?

c)      If the test shows p̄ = 0.10 or more, the assembly line operation will be shut down to check for the cause of the defects. What is the probability that the sample of 50 parts will lead to the conclusion that the assembly line should be shut down?

3) A market research firm conducts telephone surveys with a 40% historical response rate. What is the probability that in a new sample of 400 telephone numbers, at least 150 individuals will cooperate and respond to the questions? In other words, what is the probability that the sample proportion will be at least 150/400 = 0.375?
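One possible way to approach the three exercises numerically is sketched below. It is not an official solution: it assumes the normal approximation to the sampling distribution, uses no continuity correction and no finite population correction, and the helper names `normal_cdf` and `prob_within` are invented for this illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_within(margin, std_error):
    """P(|statistic - parameter| <= margin) under a normal sampling distribution."""
    z = margin / std_error
    return normal_cdf(z) - normal_cdf(-z)


# Exercise 1: sigma = 0.10 YTL, n = 50 stores, standard error = sigma / sqrt(n).
se_mean = 0.10 / math.sqrt(50)
p1b = prob_within(0.02, se_mean)  # sample mean within 0.02 YTL of the population mean
p1c = prob_within(0.01, se_mean)  # sample mean within 0.01 YTL of the population mean

# Exercise 2: p = 0.15, n = 50, standard error = sqrt(p(1 - p) / n).
se_p2 = math.sqrt(0.15 * 0.85 / 50)
p2b = prob_within(0.03, se_p2)                  # proportion within 0.03 of p
p2c = 1.0 - normal_cdf((0.10 - 0.15) / se_p2)   # proportion of 0.10 or more

# Exercise 3: p = 0.40, n = 400, threshold 150/400 = 0.375.
se_p3 = math.sqrt(0.40 * 0.60 / 400)
p3 = 1.0 - normal_cdf((0.375 - 0.40) / se_p3)   # proportion of at least 0.375

for label, value in [("1b", p1b), ("1c", p1c), ("2b", p2b), ("2c", p2c), ("3", p3)]:
    print(f"exercise {label}: {value:.4f}")
```

Halving the margin in exercise 1 lowers the probability noticeably, which shows how tightly the sample mean concentrates around the population mean when n = 50.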

 

 


SAMPLING METHODS


If you survey every person or a whole set of units in a population, you are taking a census. However, a census is often impracticable because it is very costly in terms of time and money. For example, a survey that asks complicated questions may need to use trained interviewers to ensure questions are understood. This may be too expensive if every person in the population is to be included.

Sometimes taking a census can be impossible. For example, a car manufacturer might want to test the strength of cars being produced. Obviously, each car could not be crash tested to determine its strength!
To overcome these problems, samples are taken from populations, and estimates made about the total population based on information derived from the sample. A sample must be large enough to give a good representation of the population, but small enough to be manageable. In this section the two major types of sampling, random and non-random, will be examined.

RANDOM SAMPLING

In random sampling, all items have some chance of selection that can be calculated. Random sampling techniques ensure that bias is not introduced regarding who is included in the survey. Five common random sampling techniques are:

 

 

  • simple random sampling,
  • systematic sampling,
  • stratified sampling,
  • cluster sampling, and
  • multi-stage sampling.

 

 

 

 

 

 


SIMPLE RANDOM SAMPLING

With simple random sampling, each item in a population has an equal chance of inclusion in the sample. For example, each name in a telephone book could be numbered sequentially. If the sample were to include 2,000 people, then 2,000 numbers could be randomly generated by computer or picked out of a hat. These numbers could then be matched to names in the telephone book, thereby providing a list of 2,000 people.

 

A Tattslotto draw is a good example of simple random sampling. A sample of 6 numbers is randomly generated from a population of 45, with each number having an equal chance of being selected.

The advantage of simple random sampling is that it is simple and easy to apply when small populations are involved. However, because every person or item in a population has to be listed before the corresponding random numbers can be read, this method is very cumbersome to use for large populations.
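The two examples above can be sketched in a few lines. This is a minimal illustration only: the directory size of 50,000 entries is a made-up figure, and the seed is fixed purely so the sketch is reproducible.

```python
import random

rng = random.Random(7)  # fixed seed only so the sketch is reproducible

# Hypothetical directory: entries numbered 1..50,000 stand in for the
# sequentially numbered names in a telephone book.
directory_size = 50_000
selected_entries = rng.sample(range(1, directory_size + 1), k=2_000)

# Lotto-style draw: 6 distinct numbers from a population of 45.
lotto_draw = sorted(rng.sample(range(1, 46), k=6))

print(len(selected_entries), "directory entries selected")
print("lotto draw:", lotto_draw)
```

`random.sample` draws without replacement, so no entry can be selected twice, and every entry has the same chance of being included.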


SYSTEMATIC SAMPLING

Systematic sampling, sometimes called interval sampling, means that there is a gap, or interval, between each selection. This method is often used in industry, where an item is selected for testing from a production line (say, every fifteen minutes) to ensure that machines and equipment are working to specification.

Alternatively, the manufacturer might decide to select every 20th item on a production line to test for defects and quality. This technique requires the first item to be selected at random as a starting point for testing and, thereafter, every 20th item is chosen.
This technique could also be used when questioning people in a sample survey. A market researcher might select every 10th person who enters a particular store, after selecting a person at random as a starting point; or interview occupants of every 5th house in a street, after selecting a house at random as a starting point.

It may be that a researcher wants to select a fixed size sample. In this case, it is first necessary to know the whole population size from which the sample is being selected. The appropriate sampling interval, I, is then calculated by dividing population size, N, by required sample size, n, as follows:

 


I = N/n

 

 

 

 

EXAMPLE
If a systematic sample of 500 students were to be carried out in a university with an enrolled population of 10,000, the sampling interval would be:

 

I = N/n = 10,000/500 = 20

Note: if I is not a whole number, then it is rounded to the nearest whole number.

All students would be assigned sequential numbers. The starting point would be chosen by selecting a random number between 1 and 20. If this number was 9, then the 9th student on the list of students would be selected along with every following 20th student. The sample of students would be those corresponding to student numbers 9, 29, 49, 69, ........ 9929, 9949, 9969 and 9989.
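The selection described above can be reproduced with a short sketch. The function name `systematic_sample` and its parameters are chosen for this illustration. Note that Python's `round` rounds exact halves to the nearest even number, which may differ from the rounding rule a survey designer has in mind when N/n is not a whole number.

```python
import random


def systematic_sample(population_size, sample_size, start=None, rng=random):
    """Return the list positions selected by systematic sampling."""
    interval = round(population_size / sample_size)  # I = N/n
    if start is None:
        start = rng.randint(1, interval)  # random starting point between 1 and I
    # Select the start position and every I-th position after it.
    return list(range(start, population_size + 1, interval))


students = systematic_sample(10_000, 500, start=9)
print(students[:4], students[-4:], len(students))
# [9, 29, 49, 69] [9929, 9949, 9969, 9989] 500
```

Leaving out `start` lets the function choose the starting point at random, as the method requires in practice.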

The advantage of systematic sampling is that it is simpler to select one random number and then every ‘Ith’ (e.g. 20th) member on the list, than to select as many random numbers as sample size. It also gives a good spread right across the population. A disadvantage is that you may need a list to start with, if you wish to know your sample size and calculate your sampling interval.

STRATIFIED SAMPLING

A general problem with random sampling is that you could, by chance, miss out a particular group in the sample. However, if you form the population into groups, and sample from each group, you can make sure the sample is representative.

In stratified sampling, the population is divided into groups called strata. A sample is then drawn from within these strata. Some examples of strata commonly used by the Australian Bureau of Statistics (ABS) are States, Age and Sex. Other strata may be religion, academic ability or marital status.
EXAMPLE

      • The committee of a school of 1,000 students wishes to assess any reaction to the re-introduction of Pastoral Care into the school timetable. To ensure a representative sample of students from all year levels, the committee uses the stratified sampling technique.


        In this case the strata are the year levels. Within each stratum the committee selects a sample. So, in a sample of 100 students, all year levels would be included. The students in the sample would be selected using simple random sampling or systematic sampling within each stratum.

Stratification is most useful when the stratifying variables are simple to work with, easy to observe and closely related to the topic of the survey.

An important aspect of stratification is that it can be used to select more of one group than another. You may do this if you feel that responses are more likely to vary in one group than another. So, if you know everyone in one group has much the same value, you only need a small sample to get information for that group; whereas in another group, the values may differ widely and a bigger sample is needed.

If you want to combine group level information to get an answer for the whole population, you have to take account of what proportion you selected from each group (see ‘Bias in Estimation’).
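The school example can be sketched as follows, assuming proportional allocation, that is, the same sampling fraction in every year level. The year-level sizes are hypothetical figures chosen to add up to 1,000 students and to give whole-number allocations; a real design would need a rounding rule, or could deliberately sample some strata more heavily, as noted above.

```python
import random

rng = random.Random(11)  # fixed seed only so the sketch is reproducible

# Hypothetical stratum sizes (year level -> number of students); total 1,000.
year_level_sizes = {7: 200, 8: 190, 9: 180, 10: 170, 11: 140, 12: 120}
population = sum(year_level_sizes.values())
total_sample = 100

stratified_sample = {}
for year, size in year_level_sizes.items():
    # Proportional allocation: each stratum gets size / population of the sample.
    allocation = size * total_sample // population
    # Simple random sample within the stratum; students are numbered 1..size.
    stratified_sample[year] = rng.sample(range(1, size + 1), k=allocation)

for year, chosen in stratified_sample.items():
    print(f"year {year}: {len(chosen)} students sampled")
```

Because every stratum contributes students, no year level can be missed by chance, which is the point of stratifying.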

 

CLUSTER SAMPLING

It is sometimes expensive to spread your sample across the population as a whole. For example, travel can become expensive if you are using interviewers to travel between people spread all over the country. To reduce costs you may choose a cluster sampling technique.

Cluster sampling divides the population into groups, or clusters. A number of clusters are selected randomly to represent the population, and then all units within selected clusters are included in the sample. No units from non-selected clusters are included in the sample. They are represented by those from selected clusters. This differs from stratified sampling, where some units are selected from each group.

Examples of clusters may be factories, schools and geographic areas such as electoral sub-divisions. The selected clusters are then used to represent the population.
EXAMPLE

      • Suppose an organisation wishes to find out which sports Year 11 students are participating in across Australia. It would be too costly and take too long to survey every student, or even some students from every school. Instead, 100 schools are randomly selected from all over Australia.

        These schools are considered to be clusters. Then, every Year 11 student in these 100 schools is surveyed. In effect, students in the sample of 100 schools represent all Year 11 students in Australia.

Cluster sampling has several advantages: reduced costs, simplified field work and more convenient administration. Instead of having a sample scattered over the entire coverage area, the sample is more localised in relatively few centres (clusters).

Cluster sampling’s disadvantage is that less accurate results are often obtained due to higher sampling error than for simple random sampling with the same sample size. In the above example, you might expect to get more accurate estimates from randomly selecting students across all schools than from randomly selecting 100 schools and taking every student in those chosen.
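A minimal sketch of the Year 11 example follows. The sampling frame is entirely made up: 2,000 schools with between 40 and 200 Year 11 students each, and the student labels are placeholders.

```python
import random

rng = random.Random(3)  # fixed seed only so the sketch is reproducible

# Hypothetical frame: school id -> list of Year 11 student labels.
schools = {
    school_id: [f"S{school_id}-{i}" for i in range(rng.randint(40, 200))]
    for school_id in range(1, 2001)
}

# Stage 1 (the only random stage): choose 100 schools (clusters).
chosen_schools = rng.sample(sorted(schools), k=100)

# Every student in a chosen school is included; all other schools contribute nobody.
cluster_sample = [
    student for school_id in chosen_schools for student in schools[school_id]
]

print(len(chosen_schools), "schools,", len(cluster_sample), "students surveyed")
```

The final sample size is not fixed in advance here: it depends on how large the chosen schools happen to be.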


MULTI-STAGE SAMPLING

Multi-stage sampling is like cluster sampling, but involves selecting a sample within each chosen cluster, rather than including all units in the cluster. Thus, multi-stage sampling involves selecting a sample in at least two stages. In the first stage, large groups or clusters are selected. These clusters are designed to contain more population units than are required for the final sample.

In the second stage, population units are chosen from selected clusters to derive a final sample. If more than two stages are used, the process of choosing population units within clusters continues until the final sample is achieved.
EXAMPLE

      • An example of multi-stage sampling is where, firstly, electoral sub-divisions (clusters) are sampled from a city or state. Secondly, blocks of houses are selected from within the electoral sub-divisions and, thirdly, individual houses are selected from within the selected blocks of houses.

The advantages of multi-stage sampling are convenience, economy and efficiency. Multi-stage sampling does not require a complete list of members in the target population, which greatly reduces sample preparation cost. The list of members is required only for those clusters used in the final stage. The main disadvantage of multi-stage sampling is the same as for cluster sampling: lower accuracy due to higher sampling error.
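The three-stage house selection in the example can be sketched as below. All sizes are hypothetical (30 sub-divisions, 40 blocks per sub-division, 25 houses per block), and the function name `multistage_sample` is invented for this illustration.

```python
import random


def multistage_sample(rng, n_subdivisions=30, blocks_per_subdivision=40,
                      houses_per_block=25, k_subdivisions=5, k_blocks=4, k_houses=3):
    """Select houses in three stages: sub-divisions, then blocks, then houses."""
    sample = []
    # Stage 1: sample electoral sub-divisions (clusters).
    for sub in rng.sample(range(n_subdivisions), k_subdivisions):
        # Stage 2: sample blocks only within the chosen sub-divisions.
        for block in rng.sample(range(blocks_per_subdivision), k_blocks):
            # Stage 3: sample houses only within the chosen blocks.
            for house in rng.sample(range(houses_per_block), k_houses):
                sample.append((sub, block, house))
    return sample


houses = multistage_sample(random.Random(5))
print(len(houses))  # 5 * 4 * 3 = 60
```

Houses are enumerated only inside the blocks that were actually chosen, which mirrors the point that a complete list of the target population is not needed.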