Class WeightedStringsFromCSV
java.lang.Object
io.nosqlbench.virtdata.library.basics.shared.distributions.WeightedStringsFromCSV
- All Implemented Interfaces:
LongFunction<String>
- Direct Known Subclasses:
FirstNames, LastNames
Provides sampling of a given field in a CSV file according
to discrete probabilities. The CSV file must have headers which can
be used to find the named columns for value and weight. The value column
contains the string result to be returned by the function. The weight
column contains the floating-point weight or mass associated with the
value on the same line. All the weights are normalized automatically.
If multiple file names are given, each file must use the same format, and all of them are read in the same way.
If the first word in the filenames list is 'map', then the values are not selected pseudo-randomly. Instead, they are mapped over in a stable, unsorted order as input values vary from 0L to Long.MAX_VALUE.
Generally, you want to leave out the 'map' directive to get "random sampling" of these values.
This function works the same as the three-parameter form of WeightedStrings, which is deprecated in favor of this one. Use this one instead.
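The behavior described above (automatic weight normalization, a 'map' mode that walks the input range directly, and a default mode that selects pseudo-randomly) can be illustrated with a small stand-alone sketch. This is not the library's implementation: the class name WeightedSamplerSketch, the cumulative-weight lookup, and the SplitMix64-style mixing function are all assumptions chosen only to make the idea concrete.

```java
import java.util.function.LongFunction;

/** Illustrative sketch only; not the nosqlbench implementation. */
final class WeightedSamplerSketch implements LongFunction<String> {
    private final String[] values;
    private final double[] cumulative; // normalized running totals, last entry is 1.0
    private final boolean map;         // true: use the input directly; false: hash it first

    WeightedSamplerSketch(String[] values, double[] weights, boolean map) {
        if (values.length == 0 || values.length != weights.length) {
            throw new IllegalArgumentException("values and weights must be non-empty and equal in length");
        }
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        if (!(total > 0.0)) {
            throw new IllegalArgumentException("weights must sum to a positive number");
        }
        this.values = values.clone();
        this.cumulative = new double[weights.length];
        double running = 0.0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i] / total; // normalize so the weights sum to 1.0
            cumulative[i] = running;
        }
        cumulative[cumulative.length - 1] = 1.0; // guard against rounding drift
        this.map = map;
    }

    @Override
    public String apply(long input) {
        long bits = map ? input : mix(input);
        // Clear the sign bit, then scale into the unit interval [0.0, 1.0].
        double unit = (double) (bits & Long.MAX_VALUE) / (double) Long.MAX_VALUE;
        for (int i = 0; i < cumulative.length; i++) {
            if (unit < cumulative[i]) {
                return values[i];
            }
        }
        return values[values.length - 1];
    }

    // SplitMix64-style mixing step, assumed here as a stand-in for whatever hash the library uses.
    private static long mix(long z) {
        z += 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }
}
```

With weights 1.0 and 3.0, the first value occupies the first quarter of the input range in map mode and the second value the remainder. In the hashed mode, the same input always yields the same value, but neighboring inputs are scattered across the range.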
Constructor Summary
Constructors
Constructor: WeightedStringsFromCSV(String valueColumn, String weightColumn, String... filenames)
Description: Create a sampler of strings from the given CSV file.
Method Summary
Constructor Details
WeightedStringsFromCSV
Create a sampler of strings from the given CSV file. The CSV file must have plain CSV headers as its first line.
- Parameters:
valueColumn - The name of the value column to be sampled
weightColumn - The name of the weight column, which must be parsable as a double
filenames - One or more file names which will be read in to the sampler buffer
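As a rough illustration of how the header line can be used to locate the value and weight columns, the sketch below reads already-loaded lines rather than files. It is not the library's parser: the class name CsvColumnsSketch is hypothetical, and it assumes simple comma-separated fields with no quoting or escaping.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; assumes plain comma-separated fields with no quoting. */
final class CsvColumnsSketch {
    final List<String> values = new ArrayList<>();
    final List<Double> weights = new ArrayList<>();

    CsvColumnsSketch(String valueColumn, String weightColumn, List<String> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("missing header line");
        }
        // The first line holds the headers; find the two named columns.
        String[] headers = lines.get(0).split(",", -1);
        int valueIdx = indexOf(headers, valueColumn);
        int weightIdx = indexOf(headers, weightColumn);
        if (valueIdx < 0 || weightIdx < 0) {
            throw new IllegalArgumentException("header line lacks a required column");
        }
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1);
            values.add(fields[valueIdx].trim());
            // Throws NumberFormatException if the weight is not parsable as a double.
            weights.add(Double.parseDouble(fields[weightIdx].trim()));
        }
    }

    private static int indexOf(String[] headers, String name) {
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].trim().equals(name)) {
                return i;
            }
        }
        return -1;
    }
}
```

Reading several files in the same way would amount to repeating this loop per file and appending to the same two lists, provided every file shares the same header names.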
Method Details
apply
- Specified by:
apply in interface LongFunction<String>