
Generates a simulated dataset with a specified number of samples. Optionally, a block size can be given to group samples into blocks. If the number of samples is not a multiple of the block size, additional samples are generated to complete the last block.

Usage

simulate_data(n_samples, block_size = NA, seed = 123)

Arguments

n_samples

Integer; the number of samples to generate. If block_size is specified and n_samples is not a multiple of block_size, the function will generate additional samples to ensure all blocks are complete.

block_size

Integer; the number of samples in each block. If NA (the default), no blocking is applied. Otherwise, block_size must be a positive integer, and the function adds a blocking variable that groups samples into blocks of this size.

seed

Integer; the seed for random number generation to ensure reproducibility.

Value

A data.frame with a sample_id column, three covariates (covariate1, covariate2, covariate3), and, if block_size is specified, a block_id column indicating the block to which each sample belongs. covariate1 is drawn from a uniform distribution, covariate2 from a normal distribution, and covariate3 is a categorical variable with levels "A", "B", and "C".
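
The column layout described above could be produced along the following lines. This is a minimal sketch, not the package's implementation: the helper name sketch_covariates is hypothetical, and the distribution parameters (uniform on [0, 1], standard normal), the equal sampling probabilities for the three levels, and the choice to store covariate3 as a factor are assumptions that the documentation does not state.

```r
# Hypothetical helper (not part of the package): builds a data.frame with the
# same column names as the documented return value.
sketch_covariates <- function(n, seed = 123) {
  set.seed(seed)  # fix the RNG state so repeated calls give the same data
  data.frame(
    sample_id  = paste0("Sample", seq_len(n)),
    covariate1 = runif(n),   # uniform; the [0, 1] range is an assumption
    covariate2 = rnorm(n),   # normal; mean 0 and sd 1 are assumptions
    covariate3 = factor(
      sample(c("A", "B", "C"), n, replace = TRUE),  # equal probabilities assumed
      levels = c("A", "B", "C")
    ),
    stringsAsFactors = FALSE
  )
}

df <- sketch_covariates(5)
str(df)
```

Because the seed is set inside the helper, two calls with the same arguments return identical data, which is the reproducibility that the seed argument is meant to provide.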

Details

The function simulates data with or without blocking. When block_size is provided, the samples are divided into complete blocks of that size; if n_samples is not a multiple of block_size, the total is rounded up to the next multiple and a warning is issued. For example, n_samples = 95 with block_size = 10 yields 100 samples in 10 blocks. This is useful for simulations or analyses that rely on a blocked design.
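
The rounding to complete blocks can be illustrated with a little arithmetic. The sketch below is an assumption about how such a step might be written, not the package's code; complete_blocks is a hypothetical helper, and only the "block_" label format is taken from the example output on this page.

```r
# Hypothetical helper (not part of the package): rounds the sample count up to
# the next multiple of block_size and builds the matching block labels.
complete_blocks <- function(n_samples, block_size) {
  stopifnot(block_size >= 1, block_size == round(block_size))
  n_blocks <- ceiling(n_samples / block_size)  # number of blocks needed
  n_total  <- n_blocks * block_size            # sample count after padding
  block_id <- paste0("block_", rep(seq_len(n_blocks), each = block_size))
  list(n_total = n_total, block_id = block_id)
}

res <- complete_blocks(n_samples = 95, block_size = 10)
res$n_total          # 100
table(res$block_id)  # every block holds exactly 10 samples
```

When n_samples is already a multiple of block_size, ceiling() leaves the count unchanged, so no extra samples are needed.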

Examples

# Generate a dataset without blocking
simulate_data(n_samples = 100)
#>     sample_id   covariate1   covariate2 covariate3
#> 1     Sample1 0.2875775201  0.253318514          B
#> 2     Sample2 0.7883051354 -0.028546755          B
#> 3     Sample3 0.4089769218 -0.042870457          A
#> 4     Sample4 0.8830174040  1.368602284          B
#> 5     Sample5 0.9404672843 -0.225770986          A
#> 6     Sample6 0.0455564994  1.516470604          A
#> 7     Sample7 0.5281054880 -1.548752804          B
#> 8     Sample8 0.8924190444  0.584613750          C
#> 9     Sample9 0.5514350145  0.123854244          A
#> 10   Sample10 0.4566147353  0.215941569          C
#> 11   Sample11 0.9568333453  0.379639483          A
#> 12   Sample12 0.4533341562 -0.502323453          B
#> 13   Sample13 0.6775706355 -0.333207384          A
#> 14   Sample14 0.5726334020 -1.018575383          B
#> 15   Sample15 0.1029246827 -1.071791226          B
#> 16   Sample16 0.8998249704  0.303528641          B
#> 17   Sample17 0.2460877344  0.448209779          C
#> 18   Sample18 0.0420595335  0.053004227          A
#> 19   Sample19 0.3279207193  0.922267468          A
#> 20   Sample20 0.9545036491  2.050084686          A
#> 21   Sample21 0.8895393161 -0.491031166          A
#> 22   Sample22 0.6928034062 -2.309168876          C
#> 23   Sample23 0.6405068138  1.005738524          A
#> 24   Sample24 0.9942697766 -0.709200763          B
#> 25   Sample25 0.6557057991 -0.688008616          A
#> 26   Sample26 0.7085304682  1.025571370          B
#> 27   Sample27 0.5440660247 -0.284773007          B
#> 28   Sample28 0.5941420204 -1.220717712          A
#> 29   Sample29 0.2891597373  0.181303480          A
#> 30   Sample30 0.1471136473 -0.138891362          A
#> 31   Sample31 0.9630242325  0.005764186          A
#> 32   Sample32 0.9022990451  0.385280401          C
#> 33   Sample33 0.6907052784 -0.370660032          A
#> 34   Sample34 0.7954674177  0.644376549          A
#> 35   Sample35 0.0246136845 -0.220486562          A
#> 36   Sample36 0.4777959711  0.331781964          B
#> 37   Sample37 0.7584595375  1.096839013          B
#> 38   Sample38 0.2164079358  0.435181491          C
#> 39   Sample39 0.3181810076 -0.325931586          B
#> 40   Sample40 0.2316257854  1.148807618          C
#> 41   Sample41 0.1428000224  0.993503856          A
#> 42   Sample42 0.4145463358  0.548396960          A
#> 43   Sample43 0.4137243263  0.238731735          A
#> 44   Sample44 0.3688454509 -0.627906076          A
#> 45   Sample45 0.1524447477  1.360652449          A
#> 46   Sample46 0.1388060634 -0.600259587          C
#> 47   Sample47 0.2330340995  2.187332993          A
#> 48   Sample48 0.4659624503  1.532610626          A
#> 49   Sample49 0.2659726404 -0.235700359          C
#> 50   Sample50 0.8578277153 -1.026420900          C
#> 51   Sample51 0.0458311667 -0.710406564          B
#> 52   Sample52 0.4422000742  0.256883709          A
#> 53   Sample53 0.7989248456 -0.246691878          A
#> 54   Sample54 0.1218992600 -0.347542599          A
#> 55   Sample55 0.5609479838 -0.951618567          B
#> 56   Sample56 0.2065313896 -0.045027725          A
#> 57   Sample57 0.1275316502 -0.784904469          B
#> 58   Sample58 0.7533078643 -1.667941937          A
#> 59   Sample59 0.8950453592 -0.380226520          A
#> 60   Sample60 0.3744627759  0.918996609          C
#> 61   Sample61 0.6651151946 -0.575346963          C
#> 62   Sample62 0.0948406609  0.607964322          B
#> 63   Sample63 0.3839696378 -1.617882708          B
#> 64   Sample64 0.2743836446 -0.055561966          A
#> 65   Sample65 0.8146400389  0.519407204          C
#> 66   Sample66 0.4485163414  0.301153362          A
#> 67   Sample67 0.8100643530  0.105676194          A
#> 68   Sample68 0.8123895095 -0.640706008          B
#> 69   Sample69 0.7943423211 -0.849704346          A
#> 70   Sample70 0.4398316876 -1.024128791          B
#> 71   Sample71 0.7544751586  0.117646597          B
#> 72   Sample72 0.6292211316 -0.947474614          B
#> 73   Sample73 0.7101824014 -0.490557444          B
#> 74   Sample74 0.0006247733 -0.256092192          C
#> 75   Sample75 0.4753165741  1.843862005          C
#> 76   Sample76 0.2201188852 -0.651949902          C
#> 77   Sample77 0.3798165377  0.235386572          C
#> 78   Sample78 0.6127710033  0.077960850          A
#> 79   Sample79 0.3517979092 -0.961856634          C
#> 80   Sample80 0.1111354243 -0.071308086          C
#> 81   Sample81 0.2436194727  1.444550858          C
#> 82   Sample82 0.6680555874  0.451504053          B
#> 83   Sample83 0.4176467797  0.041232922          C
#> 84   Sample84 0.7881958340 -0.422496832          C
#> 85   Sample85 0.1028646443 -2.053247222          C
#> 86   Sample86 0.4348927415  1.131337213          A
#> 87   Sample87 0.9849569800 -1.460640071          A
#> 88   Sample88 0.8930511144  0.739947511          C
#> 89   Sample89 0.8864690608  1.909103569          B
#> 90   Sample90 0.1750526503 -1.443893161          C
#> 91   Sample91 0.1306956916  0.701784335          B
#> 92   Sample92 0.6531019250 -0.262197489          A
#> 93   Sample93 0.3435164723 -1.572144159          B
#> 94   Sample94 0.6567581280 -1.514667654          B
#> 95   Sample95 0.3203732425 -1.601536174          B
#> 96   Sample96 0.1876911193 -0.530906522          A
#> 97   Sample97 0.7822943013 -1.461755585          A
#> 98   Sample98 0.0935949867  0.687916773          A
#> 99   Sample99 0.4667790416  2.100108941          C
#> 100 Sample100 0.5115054599 -1.287030476          C

# Generate a dataset with blocking, block size of 10
simulate_data(n_samples = 95, block_size = 10)
#> Warning: The number of samples is not a multiple of the block size. We currently require complete blocks, so will simualte more samples than specified
#>     sample_id   covariate1   covariate2 covariate3 block_id
#> 1     Sample1 0.2875775201  0.253318514          B  block_1
#> 2     Sample2 0.7883051354 -0.028546755          B  block_1
#> 3     Sample3 0.4089769218 -0.042870457          A  block_1
#> 4     Sample4 0.8830174040  1.368602284          B  block_1
#> 5     Sample5 0.9404672843 -0.225770986          A  block_1
#> 6     Sample6 0.0455564994  1.516470604          A  block_1
#> 7     Sample7 0.5281054880 -1.548752804          B  block_1
#> 8     Sample8 0.8924190444  0.584613750          C  block_1
#> 9     Sample9 0.5514350145  0.123854244          A  block_1
#> 10   Sample10 0.4566147353  0.215941569          C  block_1
#> 11   Sample11 0.9568333453  0.379639483          A  block_2
#> 12   Sample12 0.4533341562 -0.502323453          B  block_2
#> 13   Sample13 0.6775706355 -0.333207384          A  block_2
#> 14   Sample14 0.5726334020 -1.018575383          B  block_2
#> 15   Sample15 0.1029246827 -1.071791226          B  block_2
#> 16   Sample16 0.8998249704  0.303528641          B  block_2
#> 17   Sample17 0.2460877344  0.448209779          C  block_2
#> 18   Sample18 0.0420595335  0.053004227          A  block_2
#> 19   Sample19 0.3279207193  0.922267468          A  block_2
#> 20   Sample20 0.9545036491  2.050084686          A  block_2
#> 21   Sample21 0.8895393161 -0.491031166          A  block_3
#> 22   Sample22 0.6928034062 -2.309168876          C  block_3
#> 23   Sample23 0.6405068138  1.005738524          A  block_3
#> 24   Sample24 0.9942697766 -0.709200763          B  block_3
#> 25   Sample25 0.6557057991 -0.688008616          A  block_3
#> 26   Sample26 0.7085304682  1.025571370          B  block_3
#> 27   Sample27 0.5440660247 -0.284773007          B  block_3
#> 28   Sample28 0.5941420204 -1.220717712          A  block_3
#> 29   Sample29 0.2891597373  0.181303480          A  block_3
#> 30   Sample30 0.1471136473 -0.138891362          A  block_3
#> 31   Sample31 0.9630242325  0.005764186          A  block_4
#> 32   Sample32 0.9022990451  0.385280401          C  block_4
#> 33   Sample33 0.6907052784 -0.370660032          A  block_4
#> 34   Sample34 0.7954674177  0.644376549          A  block_4
#> 35   Sample35 0.0246136845 -0.220486562          A  block_4
#> 36   Sample36 0.4777959711  0.331781964          B  block_4
#> 37   Sample37 0.7584595375  1.096839013          B  block_4
#> 38   Sample38 0.2164079358  0.435181491          C  block_4
#> 39   Sample39 0.3181810076 -0.325931586          B  block_4
#> 40   Sample40 0.2316257854  1.148807618          C  block_4
#> 41   Sample41 0.1428000224  0.993503856          A  block_5
#> 42   Sample42 0.4145463358  0.548396960          A  block_5
#> 43   Sample43 0.4137243263  0.238731735          A  block_5
#> 44   Sample44 0.3688454509 -0.627906076          A  block_5
#> 45   Sample45 0.1524447477  1.360652449          A  block_5
#> 46   Sample46 0.1388060634 -0.600259587          C  block_5
#> 47   Sample47 0.2330340995  2.187332993          A  block_5
#> 48   Sample48 0.4659624503  1.532610626          A  block_5
#> 49   Sample49 0.2659726404 -0.235700359          C  block_5
#> 50   Sample50 0.8578277153 -1.026420900          C  block_5
#> 51   Sample51 0.0458311667 -0.710406564          B  block_6
#> 52   Sample52 0.4422000742  0.256883709          A  block_6
#> 53   Sample53 0.7989248456 -0.246691878          A  block_6
#> 54   Sample54 0.1218992600 -0.347542599          A  block_6
#> 55   Sample55 0.5609479838 -0.951618567          B  block_6
#> 56   Sample56 0.2065313896 -0.045027725          A  block_6
#> 57   Sample57 0.1275316502 -0.784904469          B  block_6
#> 58   Sample58 0.7533078643 -1.667941937          A  block_6
#> 59   Sample59 0.8950453592 -0.380226520          A  block_6
#> 60   Sample60 0.3744627759  0.918996609          C  block_6
#> 61   Sample61 0.6651151946 -0.575346963          C  block_7
#> 62   Sample62 0.0948406609  0.607964322          B  block_7
#> 63   Sample63 0.3839696378 -1.617882708          B  block_7
#> 64   Sample64 0.2743836446 -0.055561966          A  block_7
#> 65   Sample65 0.8146400389  0.519407204          C  block_7
#> 66   Sample66 0.4485163414  0.301153362          A  block_7
#> 67   Sample67 0.8100643530  0.105676194          A  block_7
#> 68   Sample68 0.8123895095 -0.640706008          B  block_7
#> 69   Sample69 0.7943423211 -0.849704346          A  block_7
#> 70   Sample70 0.4398316876 -1.024128791          B  block_7
#> 71   Sample71 0.7544751586  0.117646597          B  block_8
#> 72   Sample72 0.6292211316 -0.947474614          B  block_8
#> 73   Sample73 0.7101824014 -0.490557444          B  block_8
#> 74   Sample74 0.0006247733 -0.256092192          C  block_8
#> 75   Sample75 0.4753165741  1.843862005          C  block_8
#> 76   Sample76 0.2201188852 -0.651949902          C  block_8
#> 77   Sample77 0.3798165377  0.235386572          C  block_8
#> 78   Sample78 0.6127710033  0.077960850          A  block_8
#> 79   Sample79 0.3517979092 -0.961856634          C  block_8
#> 80   Sample80 0.1111354243 -0.071308086          C  block_8
#> 81   Sample81 0.2436194727  1.444550858          C  block_9
#> 82   Sample82 0.6680555874  0.451504053          B  block_9
#> 83   Sample83 0.4176467797  0.041232922          C  block_9
#> 84   Sample84 0.7881958340 -0.422496832          C  block_9
#> 85   Sample85 0.1028646443 -2.053247222          C  block_9
#> 86   Sample86 0.4348927415  1.131337213          A  block_9
#> 87   Sample87 0.9849569800 -1.460640071          A  block_9
#> 88   Sample88 0.8930511144  0.739947511          C  block_9
#> 89   Sample89 0.8864690608  1.909103569          B  block_9
#> 90   Sample90 0.1750526503 -1.443893161          C  block_9
#> 91   Sample91 0.1306956916  0.701784335          B block_10
#> 92   Sample92 0.6531019250 -0.262197489          A block_10
#> 93   Sample93 0.3435164723 -1.572144159          B block_10
#> 94   Sample94 0.6567581280 -1.514667654          B block_10
#> 95   Sample95 0.3203732425 -1.601536174          B block_10
#> 96   Sample96 0.1876911193 -0.530906522          A block_10
#> 97   Sample97 0.7822943013 -1.461755585          A block_10
#> 98   Sample98 0.0935949867  0.687916773          A block_10
#> 99   Sample99 0.4667790416  2.100108941          C block_10
#> 100 Sample100 0.5115054599 -1.287030476          C block_10