Package picard.illumina
Class ExtractBarcodesProgram
java.lang.Object
picard.cmdline.CommandLineProgram
picard.illumina.ExtractBarcodesProgram
- Direct Known Subclasses:
ExtractIlluminaBarcodes, IlluminaBasecallsToFastq, IlluminaBasecallsToSam
-
Field Summary
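Fields
Modifier and Type | Field | Description
static final String | BARCODE_COLUMN | Column header for the first barcode sequence (preferred).
static final String | BARCODE_NAME_COLUMN | Column header for the barcode name.
static final String | BARCODE_SEQUENCE_COLUMN |
protected Map<String,BarcodeMetric> | barcodeToMetrics |
protected final BclQualityEvaluationStrategy | bclQualityEvaluationStrategy |
boolean | COMPRESS_OUTPUTS |
protected ReadStructure | inputReadStructure | The read structure of the actual Illumina Run, i.e. the readStructure of the input data
static final String | LIBRARY_NAME_COLUMN | Column header for the library name.
int | MAX_MISMATCHES |
int | MAX_NO_CALLS |
int | MIN_MISMATCH_DELTA |
int | MINIMUM_BASE_QUALITY |
int | MINIMUM_QUALITY |
protected BarcodeMetric | noMatchMetric |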
Fields inherited from class picard.cmdline.CommandLineProgram
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_ALLOWABLE_ONE_LINE_SUMMARY_LENGTH, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, SYNTAX_TRANSITION_URL, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
-
Constructor Summary
Constructors
ExtractBarcodesProgram()
-
Method Summary
Modifier and Type | Method | Description
protected String[] | collectErrorMessages(List<String> messages, String[] superErrors) |
protected BarcodeExtractor | createBarcodeExtractor() |
protected String[] | customCommandLineValidation() | Parses all barcodes from input files and validates that all barcodes are the same length and unique.
static void | finalizeMetrics(Map<String,BarcodeMetric> barcodeToMetrics, BarcodeMetric noMatchMetric) |
protected void | outputMetrics() |
protected static htsjdk.samtools.util.Tuple<Map<String,BarcodeMetric>,List<String>> | parseInputFile(File inputFile, ReadStructure readStructure) | Parses any one of the following types of files: ExtractIlluminaBarcodes BARCODE_FILE, IlluminaBasecallsToFastq MULTIPLEX_PARAMS, IlluminaBasecallsToSam LIBRARY_PARAMS. This will validate the file format as well as populate a Map of barcodes to metrics.
Methods inherited from class picard.cmdline.CommandLineProgram
checkRInstallation, doWork, getCommandLine, getCommandLineParser, getCommandLineParserForArgs, getDefaultHeaders, getFaqLink, getMetricsFile, getPGRecord, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
-
Field Details
-
DISTANCE_MODE
@Argument(doc="The distance metric that should be used to compare the barcode-reads and the provided barcodes for finding the best and second-best assignments.")
public DistanceMetric DISTANCE_MODE
-
MAX_MISMATCHES
@Argument(doc="Maximum mismatches for a barcode to be considered a match.")
public int MAX_MISMATCHES
-
MIN_MISMATCH_DELTA
@Argument(doc="Minimum difference between number of mismatches in the best and second best barcodes for a barcode to be considered a match.")
public int MIN_MISMATCH_DELTA
-
MAX_NO_CALLS
@Argument(doc="Maximum allowable number of no-calls in a barcode read before it is considered unmatchable.")
public int MAX_NO_CALLS
-
MINIMUM_BASE_QUALITY
@Argument(shortName="Q", doc="Minimum base quality. Any barcode bases falling below this quality will be considered a mismatch even if the bases match.")
public int MINIMUM_BASE_QUALITY
-
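The thresholds documented for MAX_MISMATCHES, MIN_MISMATCH_DELTA, MAX_NO_CALLS and MINIMUM_BASE_QUALITY can be read together as one matching rule. The following is a minimal sketch of that rule, not Picard's implementation. It assumes a plain Hamming comparison (only one possible DISTANCE_MODE), assumes no-calls are written as 'N' or '.' and are not themselves counted as mismatches, and uses hypothetical names (BarcodeMatchSketch, match, mismatches).

```java
import java.util.List;

final class BarcodeMatchSketch {

    /** Returns the index of the matching barcode, or -1 if no barcode qualifies. */
    static int match(String read, int[] quals, List<String> barcodes,
                     int maxMismatches, int minMismatchDelta,
                     int maxNoCalls, int minBaseQuality) {
        if (quals.length != read.length()) {
            throw new IllegalArgumentException("one quality per base is required");
        }
        // Too many no-calls makes the read unmatchable (MAX_NO_CALLS).
        int noCalls = 0;
        for (int i = 0; i < read.length(); i++) {
            char c = read.charAt(i);
            if (c == 'N' || c == '.') noCalls++;
        }
        if (noCalls > maxNoCalls) return -1;

        // Track the best and second-best mismatch counts over all barcodes.
        int best = Integer.MAX_VALUE;
        int second = Integer.MAX_VALUE;
        int bestIdx = -1;
        for (int b = 0; b < barcodes.size(); b++) {
            int mm = mismatches(read, quals, barcodes.get(b), minBaseQuality);
            if (mm < best) {
                second = best;
                best = mm;
                bestIdx = b;
            } else if (mm < second) {
                second = mm;
            }
        }
        // MAX_MISMATCHES: the best candidate must be close enough.
        if (bestIdx < 0 || best > maxMismatches) return -1;
        // MIN_MISMATCH_DELTA: the best must beat the runner-up by a margin.
        if (second - best < minMismatchDelta) return -1;
        return bestIdx;
    }

    /** Hamming-style count; a low-quality base counts as a mismatch even if it matches. */
    static int mismatches(String read, int[] quals, String barcode, int minBaseQuality) {
        if (read.length() != barcode.length()) {
            throw new IllegalArgumentException("read and barcode must have equal length");
        }
        int mm = 0;
        for (int i = 0; i < barcode.length(); i++) {
            char r = read.charAt(i);
            if (r == 'N' || r == '.') continue; // assumption: no-calls are not mismatches
            if (r != barcode.charAt(i) || quals[i] < minBaseQuality) mm++;
        }
        return mm;
    }
}
```

In this sketch a read that is equally close to two barcodes is rejected whenever minMismatchDelta is at least 1, which is the ambiguity the delta threshold is described as guarding against.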
MINIMUM_QUALITY
@Argument(doc="The minimum quality (after transforming 0s to 1s) expected from reads. If qualities are lower than this value, an error is thrown. The default of 2 is what the Illumina\'s spec describes as the minimum, but in practice the value has been observed lower.")
public int MINIMUM_QUALITY
-
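The MINIMUM_QUALITY description combines two steps: raising qualities of 0 to 1, then rejecting anything still below the minimum. A minimal sketch of that reading follows. The class and method names are hypothetical, the exception type is an arbitrary choice, and the point at which Picard itself raises the error is not stated here, so failing on the first offending value is an assumption.

```java
final class MinimumQualitySketch {

    /** Returns a copy with 0s raised to 1; throws if any revised quality is below the minimum. */
    static byte[] reviseQualities(byte[] quals, int minimumQuality) {
        byte[] revised = new byte[quals.length];
        for (int i = 0; i < quals.length; i++) {
            // Step 1: transform 0s to 1s.
            byte q = quals[i] == 0 ? (byte) 1 : quals[i];
            // Step 2: compare the revised value against the minimum.
            if (q < minimumQuality) {
                throw new IllegalStateException(
                        "Quality " + q + " at index " + i + " is below minimum " + minimumQuality);
            }
            revised[i] = q;
        }
        return revised;
    }
}
```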
LANE
-
READ_STRUCTURE
@Argument(doc="A description of the logical structure of clusters in an Illumina Run, i.e. a description of the structure IlluminaBasecallsToSam assumes the data to be in. It should consist of integer/character pairs describing the number of cycles and the type of those cycles (B for Sample Barcode, M for molecular barcode, T for Template, and S for skip). E.g. If the input data consists of 80 base clusters and we provide a read structure of \"28T8M8B8S28T\" then the sequence may be split up into four reads:\n* read one with 28 cycles (bases) of template\n* read two with 8 cycles (bases) of molecular barcode (ex. unique molecular barcode)\n* read three with 8 cycles (bases) of sample barcode\n* 8 cycles (bases) skipped.\n* read four with 28 cycles (bases) of template\nThe skipped cycles would NOT be included in an output SAM/BAM file or in read groups therein.", shortName="RS")
public String READ_STRUCTURE
-
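The integer/character pair format described for READ_STRUCTURE can be illustrated with a small parser. This is a sketch under stated assumptions, not Picard's ReadStructure class: ReadStructureSketch, Segment, parse and totalCycles are hypothetical names, and only the four type letters named in the description (T, B, M, S) are accepted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReadStructureSketch {

    /** One integer/character pair: a cycle count and a cycle type. */
    static final class Segment {
        final int cycles;
        final char type; // 'T' template, 'B' sample barcode, 'M' molecular barcode, 'S' skip

        Segment(int cycles, char type) {
            this.cycles = cycles;
            this.type = type;
        }
    }

    private static final Pattern WHOLE = Pattern.compile("(\\d+[TBMS])+");
    private static final Pattern PAIR = Pattern.compile("(\\d+)([TBMS])");

    /** Splits a string such as "28T8M8B8S28T" into its segments. */
    static List<Segment> parse(String readStructure) {
        if (!WHOLE.matcher(readStructure).matches()) {
            throw new IllegalArgumentException("Not a valid read structure: " + readStructure);
        }
        List<Segment> segments = new ArrayList<>();
        Matcher m = PAIR.matcher(readStructure);
        while (m.find()) {
            segments.add(new Segment(Integer.parseInt(m.group(1)), m.group(2).charAt(0)));
        }
        return segments;
    }

    /** Total number of cycles described, including skipped ones. */
    static int totalCycles(List<Segment> segments) {
        int total = 0;
        for (Segment s : segments) {
            total += s.cycles;
        }
        return total;
    }
}
```

For the example string in the description, parse("28T8M8B8S28T") yields five segments whose cycle counts sum to 80, matching the 80-base clusters mentioned there; the skipped segment is still counted toward the total.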
COMPRESS_OUTPUTS
@Argument(shortName="GZIP", doc="Compress output FASTQ files using gzip and append a .gz extension to the file names.")
public boolean COMPRESS_OUTPUTS
-
BASECALLS_DIR
-
METRICS_FILE
@Argument(doc="Per-barcode and per-lane metrics written to this file.", shortName="M", optional=true)
public File METRICS_FILE
-
INPUT_PARAMS_FILE
@Argument(doc="The input file that defines parameters for the program. This is the BARCODE_FILE for `ExtractIlluminaBarcodes` or the MULTIPLEX_PARAMS or LIBRARY_PARAMS file for `IlluminaBasecallsToFastq` or `IlluminaBasecallsToSam`", optional=true)
public File INPUT_PARAMS_FILE
-
BARCODE_COLUMN
Column header for the first barcode sequence (preferred).
-
BARCODE_SEQUENCE_COLUMN
-
BARCODE_NAME_COLUMN
Column header for the barcode name.
-
LIBRARY_NAME_COLUMN
Column header for the library name.
-
BARCODE_PREFIXES
-
barcodeToMetrics
-
bclQualityEvaluationStrategy
-
noMatchMetric
-
inputReadStructure
The read structure of the actual Illumina Run, i.e. the readStructure of the input data
-
-
Constructor Details
-
ExtractBarcodesProgram
public ExtractBarcodesProgram()
-
-
Method Details
-
createBarcodeExtractor
-
customCommandLineValidation
Parses all barcodes from input files and validates that all barcodes are the same length and unique.
- Overrides:
customCommandLineValidation in class CommandLineProgram
- Returns:
null if the command line is valid. If the command line is invalid, returns an array of error messages to be written to the appropriate place.
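The two checks named above (equal length, uniqueness) and the null-or-messages return convention can be sketched as follows. This is an illustration only: BarcodeValidationSketch and validate are hypothetical names, the message wording is invented, and the real method works on the program's parsed input rather than a plain list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class BarcodeValidationSketch {

    /** Returns null when every barcode has the same length and is unique, else the error messages. */
    static String[] validate(List<String> barcodes) {
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        int expectedLength = -1;
        for (String barcode : barcodes) {
            if (expectedLength < 0) {
                // The first barcode fixes the length every other barcode must have.
                expectedLength = barcode.length();
            } else if (barcode.length() != expectedLength) {
                errors.add("Barcode " + barcode + " has length " + barcode.length()
                        + ", expected " + expectedLength);
            }
            // Set.add returns false when the barcode was already present.
            if (!seen.add(barcode)) {
                errors.add("Barcode " + barcode + " appears more than once");
            }
        }
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }
}
```

Collecting every problem before returning, rather than stopping at the first, lets a caller report all of them at once.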
-
collectErrorMessages
-
outputMetrics
protected void outputMetrics()
-
finalizeMetrics
public static void finalizeMetrics(Map<String,BarcodeMetric> barcodeToMetrics, BarcodeMetric noMatchMetric)
-
parseInputFile
protected static htsjdk.samtools.util.Tuple<Map<String,BarcodeMetric>,List<String>> parseInputFile(File inputFile, ReadStructure readStructure)
Parses any one of the following types of files: ExtractIlluminaBarcodes BARCODE_FILE, IlluminaBasecallsToFastq MULTIPLEX_PARAMS, IlluminaBasecallsToSam LIBRARY_PARAMS. This will validate the file format as well as populate a Map of barcodes to metrics.
- Parameters:
inputFile - The input file that is being parsed
readStructure - The read structure for the reads of the run
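As a rough illustration of parsing a column-headed parameter file into a map keyed by barcode, consider the sketch below. Everything specific in it is an assumption: the tab-delimited layout, the header strings "barcode_sequence" and "barcode_name" (the actual values of the column constants are not shown on this page), the String values standing in for BarcodeMetric, and the idea that the List<String> half of the returned Tuple carries messages. InputFileSketch and parse are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InputFileSketch {

    // Assumed header names; the real constants' values are not given here.
    static final String BARCODE_HEADER = "barcode_sequence";
    static final String NAME_HEADER = "barcode_name";

    /** Maps each barcode to its name; problems are appended to errors instead of thrown. */
    static Map<String, String> parse(List<String> lines, List<String> errors) {
        Map<String, String> barcodeToName = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            errors.add("Input is empty");
            return barcodeToName;
        }
        // The first line names the columns.
        List<String> header = Arrays.asList(lines.get(0).split("\t", -1));
        int barcodeIdx = header.indexOf(BARCODE_HEADER);
        int nameIdx = header.indexOf(NAME_HEADER);
        if (barcodeIdx < 0) {
            errors.add("Missing column: " + BARCODE_HEADER);
            return barcodeToName;
        }
        for (int i = 1; i < lines.size(); i++) {
            String[] fields = lines.get(i).split("\t", -1);
            if (fields.length != header.size()) {
                errors.add("Line " + (i + 1) + " has " + fields.length
                        + " fields, expected " + header.size());
                continue;
            }
            String barcode = fields[barcodeIdx];
            String name = nameIdx >= 0 ? fields[nameIdx] : "";
            // putIfAbsent returns the earlier value when the barcode was already seen.
            if (barcodeToName.putIfAbsent(barcode, name) != null) {
                errors.add("Line " + (i + 1) + " repeats barcode " + barcode);
            }
        }
        return barcodeToName;
    }
}
```

The sketch takes lines rather than a File so it can be exercised without touching the file system.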
-