Below are three examples that illustrate how to sample from a worksheet: Simple Random Sampling without replacement, sampling with replacement, and Stratified Random Sampling.

Sampling from a Worksheet Using Simple Random Sampling

On the Analytic Solver Data Science ribbon, click Help - Example Models, and select Forecasting/Data Science Examples to open the data set Sampling.xlsx. This data set contains a variable ID for the record identification and seven variables, v1, v2, v7, v8, v9, v10, v11.

Sampling.xlsx

Click a cell within the data, then on the Analytic Solver Data Science ribbon, from the Data tab, select Get Data - Worksheet to open the Sample From Worksheet dialog.

In this example, the default option, Simple Random Sampling, will be used.

Under Variables, select all variables in the Variables list, and click > to include them in the Variables in Sampled Data list, then click OK.

A portion of the output is shown below.

The output is a simple random sample without replacement, with a default random seed setting of 12345. The desired sample size is 86 records, as displayed in the Sample Size field.
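For readers who want to see the idea outside the dialog, the following is a minimal Python sketch of simple random sampling without replacement. It is not Analytic Solver's implementation: the function name simple_random_sample and the 200-record stand-in for the worksheet are assumptions made for illustration, and Python's random number generator will not reproduce the records that Analytic Solver selects for seed 12345.

```python
import random

def simple_random_sample(records, sample_size, seed=12345):
    """Draw sample_size distinct records (sampling without replacement)."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(records, sample_size)

# Hypothetical stand-in for the ID column: 200 record IDs.
# The real row count of Sampling.xlsx is not assumed here.
record_ids = list(range(1, 201))

sample = simple_random_sample(record_ids, 86)
print(len(sample))       # 86 records in the sample
print(len(set(sample)))  # 86 distinct IDs: no record is drawn twice
```

Running the function again with the same seed returns the same 86 IDs, which is the purpose of a fixed seed setting.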

Sampling from a Worksheet Using Sampling with Replacement

On the Analytic Solver Data Science ribbon, from the Data tab, select Get Data - Worksheet to open the Sample From Worksheet dialog.

From the Variables list, select all variables and click > to include them in the Variables in Sampled Data list. Check Sample with replacement, and enter 300 for Desired sample size. Click OK. Because records are drawn with replacement, Analytic Solver Data Science can generate a sample with more records than the data set contains.

The output below indicates True for Sample with replacement. Because records are drawn with replacement, the Desired sample size can be greater than the number of records in the input data. Looking closely at the ID column, you'll see that several records have been sampled more than once.
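The effect of sampling with replacement can be sketched in a few lines of Python. This is an illustration only, not Analytic Solver's implementation: the function name sample_with_replacement and the 200-record stand-in for the worksheet are assumptions. Each draw is made from the full list, so the same ID can be picked again and the sample can be larger than the data.

```python
import random
from collections import Counter

def sample_with_replacement(records, sample_size, seed=12345):
    """Draw sample_size records; each draw is made from the full list."""
    rng = random.Random(seed)
    return rng.choices(records, k=sample_size)

# Hypothetical stand-in for the ID column: 200 record IDs.
record_ids = list(range(1, 201))

sample = sample_with_replacement(record_ids, 300)

# IDs that were drawn more than once, with their draw counts.
repeats = {rid: n for rid, n in Counter(sample).items() if n > 1}

print(len(sample))       # 300, more than the 200 available IDs
print(len(repeats) > 0)  # True: 300 draws from 200 IDs must repeat some ID
```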

Sampling from a Worksheet Using Stratified Random Sampling

The following examples illustrate how to sample from a worksheet using stratified random sampling.

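Click back to the Data worksheet and click a cell. On the Analytic Solver Data Science ribbon, from the Data tab, select Get Data - Worksheet to open the Sample From Worksheet dialog. Select all variables from the Variables list, then click > to include them in the Variables in Sampled Data list. At Desired sample size, enter 86, and at Set seed, enter 12345. Select Stratified random sampling.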

Next to Stratum variable, click the down arrow and select v8. (Analytic Solver Basic allows only variables that have fewer than 30 distinct values. Analytic Solver Comprehensive and Analytic Solver Data Science allow variables with an unlimited number of distinct values.) The number of strata is displayed automatically once you select v8. Select Proportional to stratum size, then click OK.

Analytic Solver Data Science calculated the percentage of records that fall in each stratum of v8 in the data set and maintained those percentages in the sample.
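Proportional allocation can be sketched as follows. This is a hedged illustration, not Analytic Solver's algorithm: the function name, the (ID, stratum) record layout, and the eight stratum sizes are hypothetical, and the largest-remainder rounding used to reach exactly 86 records is an assumption, since the rounding rule the product uses is not documented here.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(records, stratum_of, sample_size, seed=12345):
    """Sample without replacement, keeping each stratum's share of the data."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    total = len(records)
    # Ideal (fractional) number of sampled records for each stratum.
    quotas = {key: sample_size * len(rows) / total for key, rows in strata.items()}
    counts = {key: int(q) for key, q in quotas.items()}

    # Assumed rounding rule: hand out the records lost by rounding down,
    # largest fractional part first, so the counts add up to sample_size.
    leftover = sample_size - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - counts[k], reverse=True)
    for key in by_remainder[:leftover]:
        counts[key] += 1

    sample = []
    for key, rows in strata.items():
        sample.extend(rng.sample(rows, counts[key]))
    return sample

# Hypothetical data: (ID, stratum) pairs in 8 strata of made-up sizes.
sizes = [8, 12, 20, 25, 30, 35, 30, 40]
records = []
for stratum, size in enumerate(sizes, start=1):
    for _ in range(size):
        records.append((len(records) + 1, stratum))

sample = proportional_stratified_sample(records, lambda rec: rec[1], 86)
print(len(sample))  # 86
```

With these made-up sizes, a stratum holding 20% of the records (40 of 200) contributes about 20% of the 86 sampled records.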

Click back to the Data worksheet and click a cell. On the Analytic Solver Data Science ribbon, select Get Data - Worksheet to open the Sample From Worksheet dialog. 

From the Variables list, select all variables and click > to include them in the Variables in Sampled Data list. Select Stratified random sampling. Select v8 as the Stratum variable. The number of strata is displayed automatically. Select the option Equal from each stratum, please specify # records, then enter the number of records to draw from each stratum. This number should not be greater than the smallest stratum size, which in this case is 8. (Note: The smallest stratum size is displayed automatically in the field next to the option Equal from each stratum, please specify # records.) Enter 7, which is less than the limit of 8, and then click OK.

In the output, the number of records in the sampled data is 56, or 7 records per stratum for 8 strata (7 * 8 = 56).
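The equal-allocation arithmetic can be checked with a short Python sketch. It is an illustration under stated assumptions, not Analytic Solver's implementation: the function name, the (ID, stratum) record layout, and the stratum sizes are hypothetical, chosen only so that there are 8 strata and the smallest holds 8 records, as in this example.

```python
import random
from collections import defaultdict

def equal_stratified_sample(records, stratum_of, per_stratum=None, seed=12345):
    """Draw the same number of records from every stratum, without replacement."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    smallest = min(len(rows) for rows in strata.values())
    if per_stratum is None:
        per_stratum = smallest  # use the smallest stratum size as the count
    if per_stratum > smallest:
        # Without replacement, a stratum cannot supply more records than it holds.
        raise ValueError(f"per_stratum must not exceed {smallest}")

    sample = []
    for rows in strata.values():
        sample.extend(rng.sample(rows, per_stratum))
    return sample

# Hypothetical data: 8 strata of made-up sizes; the smallest has 8 records.
sizes = [8, 12, 20, 25, 30, 35, 30, 40]
records = []
for stratum, size in enumerate(sizes, start=1):
    for _ in range(size):
        records.append((len(records) + 1, stratum))

print(len(equal_stratified_sample(records, lambda rec: rec[1], 7)))  # 56
print(len(equal_stratified_sample(records, lambda rec: rec[1])))     # 64
```

Requesting 9 records per stratum raises an error in this sketch, which mirrors the limit of 8 described above.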

If a larger sample with an equal number of records from each stratum is desired, use the same options as above and also select Sample with replacement.

Click back to the Data worksheet. On the Analytic Solver Data Science ribbon, from the Data tab, select Get Data - Worksheet to open the Sample From Worksheet dialog. Select Sample with replacement and Stratified random sampling. Select v8 for the Stratum variable. Select Equal from each stratum, please specify # records, and enter 20. Though the smallest stratum size is 8 in this data set, more records for the sample can be acquired since Sample with replacement was selected. Click OK.

Since the output sample has 20 records per stratum, the #records in sampled data is 160 (20 records per stratum for 8 strata).
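The sketch below shows why the limit of 8 no longer applies once records are drawn with replacement. As before, it is an illustration rather than Analytic Solver's implementation: the function name, the (ID, stratum) record layout, and the stratum sizes are hypothetical.

```python
import random
from collections import defaultdict

def equal_stratified_sample_with_replacement(records, stratum_of, per_stratum, seed=12345):
    """Draw per_stratum records from every stratum, allowing repeats."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    sample = []
    for rows in strata.values():
        # choices() draws with replacement, so per_stratum may exceed len(rows).
        sample.extend(rng.choices(rows, k=per_stratum))
    return sample

# Hypothetical data: 8 strata of made-up sizes; the smallest has 8 records.
sizes = [8, 12, 20, 25, 30, 35, 30, 40]
records = []
for stratum, size in enumerate(sizes, start=1):
    for _ in range(size):
        records.append((len(records) + 1, stratum))

sample = equal_stratified_sample_with_replacement(records, lambda rec: rec[1], 20)
from_smallest = [rec for rec in sample if rec[1] == 1]

print(len(sample))              # 160 (20 records for each of 8 strata)
print(len(from_smallest))       # 20 draws from a stratum of only 8 records
print(len(set(from_smallest)))  # at most 8 distinct records, so some repeat
```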

Click back to the Data worksheet. On the Analytic Solver Data Science ribbon, from the Data tab, select Get Data - Worksheet to open the Sample From Worksheet dialog. From the Variables list, select all variables and click > to include them in the Variables in Sampled Data list. Select Stratified random sampling, then for the Stratum variable, select v8, and select Equal from each stratum, # records = smallest stratum size. The edit box to the right of the option displays the number 8 (this is the smallest stratum size). Click OK.

The output is displayed below.

Since the output sample has eight records per stratum, the #records in sampled data is 64 (eight records per stratum for eight strata).