Simulate Data with Optional Blocking
simulate_data.Rd
This function generates a simulated dataset containing a specified number of samples. Users can optionally specify a block size to group samples into blocks. If the number of samples is not a multiple of the block size, additional samples are generated to complete the last block.
Arguments
- n_samples
Integer; the number of samples to generate. If block_size is specified and n_samples is not a multiple of block_size, the function generates additional samples so that all blocks are complete.
- block_size
Integer; the size of each block for blocking variable creation. If NA (the default), no blocking is applied. If specified, block_size must be a positive integer, and the function creates a blocking variable that groups samples into blocks of this size.
- seed
Integer; the seed for random number generation to ensure reproducibility.
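The interaction between n_samples and block_size comes down to rounding up to the next multiple of the block size. The following is a minimal sketch of that arithmetic, not the package's actual code; round_up_to_blocks is a hypothetical helper name introduced here for illustration only.

```r
# Hypothetical helper (not part of the package): round a requested sample
# count up to the next multiple of the block size.
round_up_to_blocks <- function(n_samples, block_size = NA) {
  # No blocking requested: the count is used as given.
  if (is.na(block_size)) {
    return(n_samples)
  }
  # block_size must be a positive whole number.
  stopifnot(block_size > 0, block_size == round(block_size))
  ceiling(n_samples / block_size) * block_size
}

round_up_to_blocks(95, 10)   # 100: five extra samples complete the last block
round_up_to_blocks(100, 10)  # 100: already a multiple, nothing added
round_up_to_blocks(95)       # 95: no blocking, count unchanged
```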
Value
A data.frame with a sample_id column, three covariates (covariate1, covariate2, covariate3), and, if block_size is specified, a block_id column indicating the block to which each sample belongs. covariate1 is drawn from a uniform distribution, covariate2 from a normal distribution, and covariate3 is a categorical variable with levels "A", "B", and "C".
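To make the column layout concrete, here is a rough sketch that builds a data frame of the same shape. It is an illustration under stated assumptions, not the function's implementation: sketch_covariates is a hypothetical name, the seed value of 1 is arbitrary, and the distribution parameters (uniform on [0, 1], standard normal, equal-probability categories) are guesses, so the values will not match the output of simulate_data().

```r
# Hypothetical stand-in for illustration; distribution parameters and the
# order of the random draws are assumptions, not taken from the package.
sketch_covariates <- function(n_samples, seed = 1) {
  set.seed(seed)  # fixes the random number stream for reproducibility
  data.frame(
    sample_id  = paste0("Sample", seq_len(n_samples)),
    covariate1 = runif(n_samples),             # uniform on [0, 1] (assumed)
    covariate2 = rnorm(n_samples),             # standard normal (assumed)
    covariate3 = factor(
      sample(c("A", "B", "C"), n_samples, replace = TRUE),
      levels = c("A", "B", "C")
    ),
    stringsAsFactors = FALSE
  )
}

df <- sketch_covariates(5)
str(df)
```

Calling the sketch twice with the same seed returns identical data frames, which is the behavior the seed argument is documented to provide.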
Details
The function simulates data with or without blocking. When block_size is provided, the data are divided into complete blocks of that size; if n_samples is not a multiple of block_size, the total is rounded up to the next multiple and a warning is issued. For example, n_samples = 95 with block_size = 10 yields 100 samples in 10 blocks. This is useful when simulating blocked designs, in which samples are grouped into fixed-size blocks.
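One way to picture the blocking step is as consecutive labelling: the first block_size samples get the first label, the next block_size the second, and so on. The sketch below assumes that scheme and the block_<k> label format seen in the example output; assign_blocks is a hypothetical helper, not a function exported by the package.

```r
# Hypothetical helper: label consecutive runs of `block_size` samples.
# Assumes n_total has already been rounded up to a multiple of block_size.
assign_blocks <- function(n_total, block_size) {
  stopifnot(n_total %% block_size == 0)
  n_blocks <- n_total / block_size
  paste0("block_", rep(seq_len(n_blocks), each = block_size))
}

block_id <- assign_blocks(100, 10)

# Every block should contain exactly block_size samples.
table(block_id)
```

Tabulating block_id in this way is also a quick check on real output: every count should equal block_size.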
Examples
# Generate a dataset without blocking
simulate_data(n_samples = 100)
#> sample_id covariate1 covariate2 covariate3
#> 1 Sample1 0.2875775201 0.253318514 B
#> 2 Sample2 0.7883051354 -0.028546755 B
#> 3 Sample3 0.4089769218 -0.042870457 A
#> 4 Sample4 0.8830174040 1.368602284 B
#> 5 Sample5 0.9404672843 -0.225770986 A
#> 6 Sample6 0.0455564994 1.516470604 A
#> 7 Sample7 0.5281054880 -1.548752804 B
#> 8 Sample8 0.8924190444 0.584613750 C
#> 9 Sample9 0.5514350145 0.123854244 A
#> 10 Sample10 0.4566147353 0.215941569 C
#> 11 Sample11 0.9568333453 0.379639483 A
#> 12 Sample12 0.4533341562 -0.502323453 B
#> 13 Sample13 0.6775706355 -0.333207384 A
#> 14 Sample14 0.5726334020 -1.018575383 B
#> 15 Sample15 0.1029246827 -1.071791226 B
#> 16 Sample16 0.8998249704 0.303528641 B
#> 17 Sample17 0.2460877344 0.448209779 C
#> 18 Sample18 0.0420595335 0.053004227 A
#> 19 Sample19 0.3279207193 0.922267468 A
#> 20 Sample20 0.9545036491 2.050084686 A
#> 21 Sample21 0.8895393161 -0.491031166 A
#> 22 Sample22 0.6928034062 -2.309168876 C
#> 23 Sample23 0.6405068138 1.005738524 A
#> 24 Sample24 0.9942697766 -0.709200763 B
#> 25 Sample25 0.6557057991 -0.688008616 A
#> 26 Sample26 0.7085304682 1.025571370 B
#> 27 Sample27 0.5440660247 -0.284773007 B
#> 28 Sample28 0.5941420204 -1.220717712 A
#> 29 Sample29 0.2891597373 0.181303480 A
#> 30 Sample30 0.1471136473 -0.138891362 A
#> 31 Sample31 0.9630242325 0.005764186 A
#> 32 Sample32 0.9022990451 0.385280401 C
#> 33 Sample33 0.6907052784 -0.370660032 A
#> 34 Sample34 0.7954674177 0.644376549 A
#> 35 Sample35 0.0246136845 -0.220486562 A
#> 36 Sample36 0.4777959711 0.331781964 B
#> 37 Sample37 0.7584595375 1.096839013 B
#> 38 Sample38 0.2164079358 0.435181491 C
#> 39 Sample39 0.3181810076 -0.325931586 B
#> 40 Sample40 0.2316257854 1.148807618 C
#> 41 Sample41 0.1428000224 0.993503856 A
#> 42 Sample42 0.4145463358 0.548396960 A
#> 43 Sample43 0.4137243263 0.238731735 A
#> 44 Sample44 0.3688454509 -0.627906076 A
#> 45 Sample45 0.1524447477 1.360652449 A
#> 46 Sample46 0.1388060634 -0.600259587 C
#> 47 Sample47 0.2330340995 2.187332993 A
#> 48 Sample48 0.4659624503 1.532610626 A
#> 49 Sample49 0.2659726404 -0.235700359 C
#> 50 Sample50 0.8578277153 -1.026420900 C
#> 51 Sample51 0.0458311667 -0.710406564 B
#> 52 Sample52 0.4422000742 0.256883709 A
#> 53 Sample53 0.7989248456 -0.246691878 A
#> 54 Sample54 0.1218992600 -0.347542599 A
#> 55 Sample55 0.5609479838 -0.951618567 B
#> 56 Sample56 0.2065313896 -0.045027725 A
#> 57 Sample57 0.1275316502 -0.784904469 B
#> 58 Sample58 0.7533078643 -1.667941937 A
#> 59 Sample59 0.8950453592 -0.380226520 A
#> 60 Sample60 0.3744627759 0.918996609 C
#> 61 Sample61 0.6651151946 -0.575346963 C
#> 62 Sample62 0.0948406609 0.607964322 B
#> 63 Sample63 0.3839696378 -1.617882708 B
#> 64 Sample64 0.2743836446 -0.055561966 A
#> 65 Sample65 0.8146400389 0.519407204 C
#> 66 Sample66 0.4485163414 0.301153362 A
#> 67 Sample67 0.8100643530 0.105676194 A
#> 68 Sample68 0.8123895095 -0.640706008 B
#> 69 Sample69 0.7943423211 -0.849704346 A
#> 70 Sample70 0.4398316876 -1.024128791 B
#> 71 Sample71 0.7544751586 0.117646597 B
#> 72 Sample72 0.6292211316 -0.947474614 B
#> 73 Sample73 0.7101824014 -0.490557444 B
#> 74 Sample74 0.0006247733 -0.256092192 C
#> 75 Sample75 0.4753165741 1.843862005 C
#> 76 Sample76 0.2201188852 -0.651949902 C
#> 77 Sample77 0.3798165377 0.235386572 C
#> 78 Sample78 0.6127710033 0.077960850 A
#> 79 Sample79 0.3517979092 -0.961856634 C
#> 80 Sample80 0.1111354243 -0.071308086 C
#> 81 Sample81 0.2436194727 1.444550858 C
#> 82 Sample82 0.6680555874 0.451504053 B
#> 83 Sample83 0.4176467797 0.041232922 C
#> 84 Sample84 0.7881958340 -0.422496832 C
#> 85 Sample85 0.1028646443 -2.053247222 C
#> 86 Sample86 0.4348927415 1.131337213 A
#> 87 Sample87 0.9849569800 -1.460640071 A
#> 88 Sample88 0.8930511144 0.739947511 C
#> 89 Sample89 0.8864690608 1.909103569 B
#> 90 Sample90 0.1750526503 -1.443893161 C
#> 91 Sample91 0.1306956916 0.701784335 B
#> 92 Sample92 0.6531019250 -0.262197489 A
#> 93 Sample93 0.3435164723 -1.572144159 B
#> 94 Sample94 0.6567581280 -1.514667654 B
#> 95 Sample95 0.3203732425 -1.601536174 B
#> 96 Sample96 0.1876911193 -0.530906522 A
#> 97 Sample97 0.7822943013 -1.461755585 A
#> 98 Sample98 0.0935949867 0.687916773 A
#> 99 Sample99 0.4667790416 2.100108941 C
#> 100 Sample100 0.5115054599 -1.287030476 C
# Generate a dataset with blocking (block size 10); 95 samples are rounded up to 100
simulate_data(n_samples = 95, block_size = 10)
#> Warning: The number of samples is not a multiple of the block size. We currently require complete blocks, so will simualte more samples than specified
#> sample_id covariate1 covariate2 covariate3 block_id
#> 1 Sample1 0.2875775201 0.253318514 B block_1
#> 2 Sample2 0.7883051354 -0.028546755 B block_1
#> 3 Sample3 0.4089769218 -0.042870457 A block_1
#> 4 Sample4 0.8830174040 1.368602284 B block_1
#> 5 Sample5 0.9404672843 -0.225770986 A block_1
#> 6 Sample6 0.0455564994 1.516470604 A block_1
#> 7 Sample7 0.5281054880 -1.548752804 B block_1
#> 8 Sample8 0.8924190444 0.584613750 C block_1
#> 9 Sample9 0.5514350145 0.123854244 A block_1
#> 10 Sample10 0.4566147353 0.215941569 C block_1
#> 11 Sample11 0.9568333453 0.379639483 A block_2
#> 12 Sample12 0.4533341562 -0.502323453 B block_2
#> 13 Sample13 0.6775706355 -0.333207384 A block_2
#> 14 Sample14 0.5726334020 -1.018575383 B block_2
#> 15 Sample15 0.1029246827 -1.071791226 B block_2
#> 16 Sample16 0.8998249704 0.303528641 B block_2
#> 17 Sample17 0.2460877344 0.448209779 C block_2
#> 18 Sample18 0.0420595335 0.053004227 A block_2
#> 19 Sample19 0.3279207193 0.922267468 A block_2
#> 20 Sample20 0.9545036491 2.050084686 A block_2
#> 21 Sample21 0.8895393161 -0.491031166 A block_3
#> 22 Sample22 0.6928034062 -2.309168876 C block_3
#> 23 Sample23 0.6405068138 1.005738524 A block_3
#> 24 Sample24 0.9942697766 -0.709200763 B block_3
#> 25 Sample25 0.6557057991 -0.688008616 A block_3
#> 26 Sample26 0.7085304682 1.025571370 B block_3
#> 27 Sample27 0.5440660247 -0.284773007 B block_3
#> 28 Sample28 0.5941420204 -1.220717712 A block_3
#> 29 Sample29 0.2891597373 0.181303480 A block_3
#> 30 Sample30 0.1471136473 -0.138891362 A block_3
#> 31 Sample31 0.9630242325 0.005764186 A block_4
#> 32 Sample32 0.9022990451 0.385280401 C block_4
#> 33 Sample33 0.6907052784 -0.370660032 A block_4
#> 34 Sample34 0.7954674177 0.644376549 A block_4
#> 35 Sample35 0.0246136845 -0.220486562 A block_4
#> 36 Sample36 0.4777959711 0.331781964 B block_4
#> 37 Sample37 0.7584595375 1.096839013 B block_4
#> 38 Sample38 0.2164079358 0.435181491 C block_4
#> 39 Sample39 0.3181810076 -0.325931586 B block_4
#> 40 Sample40 0.2316257854 1.148807618 C block_4
#> 41 Sample41 0.1428000224 0.993503856 A block_5
#> 42 Sample42 0.4145463358 0.548396960 A block_5
#> 43 Sample43 0.4137243263 0.238731735 A block_5
#> 44 Sample44 0.3688454509 -0.627906076 A block_5
#> 45 Sample45 0.1524447477 1.360652449 A block_5
#> 46 Sample46 0.1388060634 -0.600259587 C block_5
#> 47 Sample47 0.2330340995 2.187332993 A block_5
#> 48 Sample48 0.4659624503 1.532610626 A block_5
#> 49 Sample49 0.2659726404 -0.235700359 C block_5
#> 50 Sample50 0.8578277153 -1.026420900 C block_5
#> 51 Sample51 0.0458311667 -0.710406564 B block_6
#> 52 Sample52 0.4422000742 0.256883709 A block_6
#> 53 Sample53 0.7989248456 -0.246691878 A block_6
#> 54 Sample54 0.1218992600 -0.347542599 A block_6
#> 55 Sample55 0.5609479838 -0.951618567 B block_6
#> 56 Sample56 0.2065313896 -0.045027725 A block_6
#> 57 Sample57 0.1275316502 -0.784904469 B block_6
#> 58 Sample58 0.7533078643 -1.667941937 A block_6
#> 59 Sample59 0.8950453592 -0.380226520 A block_6
#> 60 Sample60 0.3744627759 0.918996609 C block_6
#> 61 Sample61 0.6651151946 -0.575346963 C block_7
#> 62 Sample62 0.0948406609 0.607964322 B block_7
#> 63 Sample63 0.3839696378 -1.617882708 B block_7
#> 64 Sample64 0.2743836446 -0.055561966 A block_7
#> 65 Sample65 0.8146400389 0.519407204 C block_7
#> 66 Sample66 0.4485163414 0.301153362 A block_7
#> 67 Sample67 0.8100643530 0.105676194 A block_7
#> 68 Sample68 0.8123895095 -0.640706008 B block_7
#> 69 Sample69 0.7943423211 -0.849704346 A block_7
#> 70 Sample70 0.4398316876 -1.024128791 B block_7
#> 71 Sample71 0.7544751586 0.117646597 B block_8
#> 72 Sample72 0.6292211316 -0.947474614 B block_8
#> 73 Sample73 0.7101824014 -0.490557444 B block_8
#> 74 Sample74 0.0006247733 -0.256092192 C block_8
#> 75 Sample75 0.4753165741 1.843862005 C block_8
#> 76 Sample76 0.2201188852 -0.651949902 C block_8
#> 77 Sample77 0.3798165377 0.235386572 C block_8
#> 78 Sample78 0.6127710033 0.077960850 A block_8
#> 79 Sample79 0.3517979092 -0.961856634 C block_8
#> 80 Sample80 0.1111354243 -0.071308086 C block_8
#> 81 Sample81 0.2436194727 1.444550858 C block_9
#> 82 Sample82 0.6680555874 0.451504053 B block_9
#> 83 Sample83 0.4176467797 0.041232922 C block_9
#> 84 Sample84 0.7881958340 -0.422496832 C block_9
#> 85 Sample85 0.1028646443 -2.053247222 C block_9
#> 86 Sample86 0.4348927415 1.131337213 A block_9
#> 87 Sample87 0.9849569800 -1.460640071 A block_9
#> 88 Sample88 0.8930511144 0.739947511 C block_9
#> 89 Sample89 0.8864690608 1.909103569 B block_9
#> 90 Sample90 0.1750526503 -1.443893161 C block_9
#> 91 Sample91 0.1306956916 0.701784335 B block_10
#> 92 Sample92 0.6531019250 -0.262197489 A block_10
#> 93 Sample93 0.3435164723 -1.572144159 B block_10
#> 94 Sample94 0.6567581280 -1.514667654 B block_10
#> 95 Sample95 0.3203732425 -1.601536174 B block_10
#> 96 Sample96 0.1876911193 -0.530906522 A block_10
#> 97 Sample97 0.7822943013 -1.461755585 A block_10
#> 98 Sample98 0.0935949867 0.687916773 A block_10
#> 99 Sample99 0.4667790416 2.100108941 C block_10
#> 100 Sample100 0.5115054599 -1.287030476 C block_10