Eclipse SUMO - Simulation of Urban MObility
Output formatter for Parquet output.
#include <ParquetFormatter.h>
Public Member Functions | |
| bool | closeTag (std::ostream &into, const std::string &comment="") |
| Closes the most recently opened tag. | |
| OutputFormatterType | getType () |
| Returns the type of formatter being used. | |
| void | openTag (std::ostream &into, const std::string &xmlElement) |
| Keeps track of an open XML tag by adding a new element to the stack. | |
| void | openTag (std::ostream &into, const SumoXMLTag &xmlElement) |
| Keeps track of an open XML tag by adding a new element to the stack. | |
| ParquetFormatter (const std::string &columnNames, const std::string &compression="", const int batchSize=1000000) | |
| Constructor. | |
| void | setExpectedAttributes (const SumoXMLAttrMask &expected, const int depth=2) |
| Set the expected attributes to write. This is used for tracking which attributes are expected in table-like outputs. This should not be necessary, but at least in the initial phase of implementing CSV and Parquet it helps a lot to track errors. | |
| template<> | |
| void | writeAttr (std::ostream &, const std::string &attr, const int &val) |
| template<class T > | |
| void | writeAttr (std::ostream &, const std::string &attr, const T &val) |
| template<> | |
| void | writeAttr (std::ostream &, const SumoXMLAttr attr, const int &val, const bool isNull) |
| template<class T > | |
| void | writeAttr (std::ostream &, const SumoXMLAttr attr, const T &val, const bool isNull=false) |
| writes a named attribute | |
| template<> | |
| void | writeAttr (std::ostream &into, const std::string &attr, const double &val) |
| template<> | |
| void | writeAttr (std::ostream &into, const SumoXMLAttr attr, const double &val, const bool isNull) |
| virtual void | writePadding (std::ostream &into, const std::string &val) |
| Writes some whitespace to format the output. This method is only implemented for XML output. | |
| virtual void | writePreformattedTag (std::ostream &into, const std::string &val) |
| Writes a preformatted tag to the device but ensures that any pending tags are closed. This method is only implemented for XML output. | |
| void | writeTime (std::ostream &into, const SumoXMLAttr attr, const SUMOTime val) |
| virtual bool | writeXMLHeader (std::ostream &into, const std::string &rootElement, const std::map< SumoXMLAttr, std::string > &attrs, bool writeMetadata, bool includeConfig) |
| Writes an XML header with optional configuration. | |
| bool | wroteHeader () const |
| Returns whether a header has been written. Useful to detect whether a file is being used by multiple sources. | |
| virtual | ~ParquetFormatter () |
| Destructor. | |
Private Member Functions | |
| void | checkAttr (const SumoXMLAttr attr) |
| const std::string | getAttrString (const std::string &attrString) |
Private Attributes | |
| const int | myBatchSize |
| the number of rows to write per batch | |
| std::vector< std::shared_ptr< arrow::ArrayBuilder > > | myBuilders |
| the content array builders for the table | |
| bool | myCheckColumns = false |
| whether the columns should be checked for completeness | |
| parquet::Compression::type | myCompression = parquet::Compression::UNCOMPRESSED |
| the compression to use | |
| std::string | myCurrentTag |
| the currently read tag (only valid when generating the header) | |
| SumoXMLAttrMask | myExpectedAttrs |
| the attributes which are expected for a complete row (including null values) | |
| const std::string | myHeaderFormat |
| the format to use for the column names | |
| int | myMaxDepth = 0 |
| the maximum depth of the XML hierarchy | |
| std::unique_ptr< parquet::arrow::FileWriter > | myParquetWriter |
| the output stream writer | |
| std::shared_ptr< arrow::Schema > | mySchema = arrow::schema({}) |
| the table schema | |
| SumoXMLAttrMask | mySeenAttrs |
| the attributes already seen (including null values) | |
| const OutputFormatterType | myType |
| the type of formatter being used (XML, CSV, Parquet, etc.) | |
| std::vector< std::shared_ptr< arrow::Scalar > > | myValues |
| the current attribute / column values | |
| bool | myWroteHeader = false |
| whether the schema has been constructed completely | |
| std::vector< int > | myXMLStack |
| The number of attributes in the currently open XML elements. | |
Output formatter for Parquet output.
Definition at line 65 of file ParquetFormatter.h.
| ParquetFormatter::ParquetFormatter | ( | const std::string & columnNames, const std::string & compression = "", const int batchSize = 1000000 | ) |
Constructor.
Definition at line 83 of file ParquetFormatter.cpp.
References myCompression, WRITE_ERRORF, and WRITE_WARNINGF.
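The references to myCompression and the warning/error macros suggest that the constructor translates the compression string into a codec and reports names it does not recognise. The following is a minimal sketch of that idea only; the `Codec` enum, the accepted names, and `parseCompression` are hypothetical stand-ins and are not taken from ParquetFormatter.cpp or from Arrow.

```cpp
#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

// Hypothetical stand-in for parquet::Compression::type (not the Arrow enum).
enum class Codec { Uncompressed, Snappy, Gzip, Zstd };

// Map a user-supplied compression name to a codec, ignoring case.
// An empty name selects the default; an unknown name triggers a warning
// and falls back to uncompressed output.
inline Codec parseCompression(const std::string& name) {
    std::string lower = name;
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (lower.empty() || lower == "uncompressed") {
        return Codec::Uncompressed;
    }
    if (lower == "snappy") {
        return Codec::Snappy;
    }
    if (lower == "gzip") {
        return Codec::Gzip;
    }
    if (lower == "zstd") {
        return Codec::Zstd;
    }
    std::cerr << "Warning: unknown compression '" << name
              << "', writing uncompressed.\n";
    return Codec::Uncompressed;
}
```

Falling back to a safe default rather than aborting is a design choice of this sketch; the actual constructor may treat some cases as errors.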
| virtual ParquetFormatter::~ParquetFormatter | ( | ) | inline, virtual |
Destructor.
Definition at line 72 of file ParquetFormatter.h.
| void ParquetFormatter::checkAttr | ( | const SumoXMLAttr attr | ) | inline, private |
Definition at line 161 of file ParquetFormatter.h.
References myCheckColumns, myExpectedAttrs, myMaxDepth, mySeenAttrs, myXMLStack, TLF, and toString().
Referenced by writeAttr(), writeAttr(), and writeAttr().
| bool ParquetFormatter::closeTag | ( | std::ostream & into, const std::string & comment = "" | ) | virtual |
Closes the most recently opened tag.
| [in] | into | The output stream to use |
Implements OutputFormatter.
Definition at line 131 of file ParquetFormatter.cpp.
References myBatchSize, myBuilders, myCheckColumns, myCompression, myExpectedAttrs, myMaxDepth, myParquetWriter, mySchema, mySeenAttrs, myValues, myWroteHeader, myXMLStack, toString(), WRITE_ERRORF, and WRITE_WARNING.
| const std::string ParquetFormatter::getAttrString | ( | const std::string & attrString | ) | inline, private |
Definition at line 146 of file ParquetFormatter.h.
References myCurrentTag, myHeaderFormat, and mySchema.
Referenced by writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), and writeTime().
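Since getAttrString() references myCurrentTag and myHeaderFormat, it plausibly derives a column name from the attribute name and the tag being written. The sketch below shows one way such a naming rule could look; the `HeaderStyle` enum, the underscore separator, and `columnName` are hypothetical and do not describe the values SUMO actually accepts.

```cpp
#include <string>

// Hypothetical naming styles; the formats SUMO supports may differ.
enum class HeaderStyle { AttrOnly, TagPrefixed };

// Build a column name from the current tag and the attribute name.
inline std::string columnName(HeaderStyle style,
                              const std::string& tag,
                              const std::string& attr) {
    if (style == HeaderStyle::AttrOnly) {
        return attr;                // e.g. "speed"
    }
    return tag + "_" + attr;        // e.g. "vehicle_speed"
}
```

Prefixing with the tag keeps columns distinct when two nesting levels use the same attribute name, such as an `id` on both a parent and a child element.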
| OutputFormatterType OutputFormatter::getType | ( | ) | inline, inherited |
Returns the type of formatter being used.
Definition at line 150 of file OutputFormatter.h.
References OutputFormatter::myType.
Referenced by OutputDevice::writeAttr(), OutputDevice::writeAttr(), OutputDevice::writeFuncAttr(), and OutputDevice::writeOptionalAttr().
| void ParquetFormatter::openTag | ( | std::ostream & into, const std::string & xmlElement | ) | virtual |
Keeps track of an open XML tag by adding a new element to the stack.
| [in] | into | The output stream to use (unused) |
| [in] | xmlElement | Name of element to open (unused) |
Implements OutputFormatter.
Definition at line 107 of file ParquetFormatter.cpp.
References myCurrentTag, myMaxDepth, myValues, myWroteHeader, myXMLStack, and WRITE_WARNINGF.
| void ParquetFormatter::openTag | ( | std::ostream & into, const SumoXMLTag & xmlElement | ) | virtual |
Keeps track of an open XML tag by adding a new element to the stack.
| [in] | into | The output stream to use (unused) |
| [in] | xmlElement | Name of element to open (unused) |
Implements OutputFormatter.
Definition at line 119 of file ParquetFormatter.cpp.
References myCurrentTag, myMaxDepth, myValues, myWroteHeader, myXMLStack, toString(), and WRITE_WARNINGF.
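The interplay of openTag(), writeAttr() and closeTag() with myXMLStack, myValues and myMaxDepth can be read as a way of flattening nested elements into table rows. The following is a simplified model of that reading, not the SUMO implementation: `RowFlattener` is a hypothetical class, values are plain strings instead of Arrow scalars, and the root element is assumed not to be opened at all.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of the tag stack (not the SUMO class).
class RowFlattener {
public:
    explicit RowFlattener(int maxDepth) : myMaxDepth(maxDepth) {}

    // Opening a tag pushes a counter for the attributes it will contribute.
    void openTag() {
        myStack.push_back(0);
    }

    // Each attribute becomes one column value of the current row.
    void writeAttr(const std::string& value) {
        if (myStack.empty()) {
            return;  // no open tag, nothing to attach the value to
        }
        myValues.push_back(value);
        myStack.back()++;
    }

    // Closing a tag at the deepest level emits a row. In every case the
    // values contributed by the closed tag are removed again, so that the
    // parent's values are reused for the next child.
    void closeTag() {
        if (myStack.empty()) {
            return;
        }
        if (static_cast<int>(myStack.size()) == myMaxDepth) {
            myRows.push_back(myValues);
        }
        myValues.resize(myValues.size() - static_cast<std::size_t>(myStack.back()));
        myStack.pop_back();
    }

    const std::vector<std::vector<std::string>>& rows() const {
        return myRows;
    }

private:
    int myMaxDepth;
    std::vector<int> myStack;              // attribute count per open tag
    std::vector<std::string> myValues;     // values of all open tags
    std::vector<std::vector<std::string>> myRows;
};
```

With a maximum depth of 2, a timestep element containing two vehicle elements yields two rows, each repeating the timestep's attributes in front of the vehicle's own.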
| void ParquetFormatter::setExpectedAttributes | ( | const SumoXMLAttrMask & expected, const int depth = 2 | ) | inline, virtual |
Set the expected attributes to write. This is used for tracking which attributes are expected in table-like outputs. This should not be necessary, but at least in the initial phase of implementing CSV and Parquet it helps a lot to track errors.
| [in] | expected | which attributes are to be written (at the deepest XML level) |
| [in] | depth | the maximum XML hierarchy depth (excluding the root) |
Reimplemented from OutputFormatter.
Definition at line 139 of file ParquetFormatter.h.
References myCheckColumns, myExpectedAttrs, and myMaxDepth.
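The expected-attribute mask lends itself to a simple completeness check: record each attribute as it is written and compare against the expected set when the row ends. The sketch below illustrates that bookkeeping with `std::bitset`; the `Attr` enum, `AttrMask` and `CompletenessChecker` are hypothetical and much smaller than SUMO's SumoXMLAttr and SumoXMLAttrMask.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical attribute ids; SUMO's SumoXMLAttr enum is far larger.
enum Attr : std::size_t { ATTR_ID = 0, ATTR_X, ATTR_Y, ATTR_SPEED, ATTR_COUNT };
using AttrMask = std::bitset<ATTR_COUNT>;

class CompletenessChecker {
public:
    // Enable checking and remember which attributes make up a complete row.
    void setExpected(const AttrMask& expected) {
        myExpected = expected;
        myCheck = true;
    }

    // Record an attribute; returns false if checking is enabled and the
    // attribute is not part of the expected set.
    bool see(Attr attr) {
        if (myCheck && !myExpected.test(attr)) {
            return false;
        }
        mySeen.set(attr);
        return true;
    }

    // Called when a row ends: returns the expected attributes that were
    // never written and resets the seen set for the next row.
    AttrMask finishRow() {
        const AttrMask missing = myCheck ? (myExpected & ~mySeen) : AttrMask();
        mySeen.reset();
        return missing;
    }

private:
    bool myCheck = false;
    AttrMask myExpected;
    AttrMask mySeen;
};
```

A non-empty result from `finishRow()` marks the columns that would need a null value or a diagnostic before the row is stored.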
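| template<> void ParquetFormatter::writeAttr | ( | std::ostream &, const std::string & attr, const int & val | ) | inline |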
Definition at line 264 of file ParquetFormatter.h.
References getAttrString(), myBuilders, myCheckColumns, mySchema, myValues, and myWroteHeader.
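| template<class T > void ParquetFormatter::writeAttr | ( | std::ostream &, const std::string & attr, const T & val | ) | inline |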
Definition at line 114 of file ParquetFormatter.h.
References getAttrString(), myBuilders, myCheckColumns, mySchema, myValues, myWroteHeader, and toString().
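| template<> void ParquetFormatter::writeAttr | ( | std::ostream &, const SumoXMLAttr attr, const int & val, const bool isNull | ) | inline |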
Definition at line 236 of file ParquetFormatter.h.
References checkAttr(), getAttrString(), myBuilders, mySchema, myValues, myWroteHeader, and toString().
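| template<class T > void ParquetFormatter::writeAttr | ( | std::ostream &, const SumoXMLAttr attr, const T & val, const bool isNull = false | ) | inline |
Writes a named attribute.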
| [in] | attr | The attribute (name) |
| [in] | val | The attribute value |
| [in] | isNull | The given value is not set |
Definition at line 104 of file ParquetFormatter.h.
References checkAttr(), getAttrString(), myBuilders, mySchema, myValues, myWroteHeader, and toString().
Referenced by writeTime().
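| template<> void ParquetFormatter::writeAttr | ( | std::ostream & into, const std::string & attr, const double & val | ) | inline |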
Definition at line 246 of file ParquetFormatter.h.
References getAttrString(), myBuilders, myCheckColumns, mySchema, myValues, and myWroteHeader.
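| template<> void ParquetFormatter::writeAttr | ( | std::ostream & into, const SumoXMLAttr attr, const double & val, const bool isNull | ) | inline |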
Definition at line 218 of file ParquetFormatter.h.
References checkAttr(), getAttrString(), myBuilders, mySchema, myValues, myWroteHeader, SUMO_ATTR_X, SUMO_ATTR_Y, and toString().
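| virtual void OutputFormatter::writePadding | ( | std::ostream & into, const std::string & val | ) | inline, virtual, inherited |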
Writes some whitespace to format the output. This method is only implemented for XML output.
| [in] | into | The output stream to use |
| [in] | val | The whitespace |
Reimplemented in PlainXMLFormatter.
Definition at line 136 of file OutputFormatter.h.
References UNUSED_PARAMETER.
Referenced by OutputDevice::writePadding().
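| virtual void OutputFormatter::writePreformattedTag | ( | std::ostream & into, const std::string & val | ) | inline, virtual, inherited |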
Writes a preformatted tag to the device but ensures that any pending tags are closed. This method is only implemented for XML output.
| [in] | into | The output stream to use |
| [in] | val | The preformatted data |
Reimplemented in PlainXMLFormatter.
Definition at line 125 of file OutputFormatter.h.
References UNUSED_PARAMETER.
Referenced by OutputDevice::writePreformattedTag().
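| void ParquetFormatter::writeTime | ( | std::ostream & into, const SumoXMLAttr attr, const SUMOTime val | ) | inline, virtual |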
Implements OutputFormatter.
Definition at line 123 of file ParquetFormatter.h.
References getAttrString(), gHumanReadableTime, myBuilders, mySchema, myValues, myWroteHeader, STEPS2TIME, time2string(), toString(), and writeAttr().
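The references to gHumanReadableTime, STEPS2TIME and time2string() suggest that writeTime() stores either a number of seconds or a formatted time string, depending on a global setting. The sketch below shows both conversions under stated assumptions: `SimTime` is assumed to hold non-negative milliseconds, and the `hh:mm:ss` rendering is a hypothetical format that may differ from what time2string() produces.

```cpp
#include <cstdio>
#include <string>

// Assumption: simulation time is an integer count of milliseconds.
using SimTime = long long;

// Convert simulation time to seconds as a floating point value.
inline double stepsToSeconds(SimTime t) {
    return static_cast<double>(t) / 1000.0;
}

// Render a non-negative simulation time as hh:mm:ss, dropping any
// sub-second part (hypothetical format, for illustration only).
inline std::string toClockString(SimTime t) {
    const long long totalSeconds = t / 1000;
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld",
                  totalSeconds / 3600,
                  (totalSeconds / 60) % 60,
                  totalSeconds % 60);
    return std::string(buf);
}
```

In a columnar format the choice matters for the schema: the numeric form fits a floating point column, whereas the readable form needs a string column.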
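| virtual bool OutputFormatter::writeXMLHeader | ( | std::ostream & into, const std::string & rootElement, const std::map< SumoXMLAttr, std::string > & attrs, bool writeMetadata, bool includeConfig | ) | inline, virtual, inherited |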
Writes an XML header with optional configuration.
If something has been written (myXMLStack is not empty), nothing is written and false returned. The default implementation does nothing and returns false.
| [in] | into | The output stream to use |
| [in] | rootElement | The root element to use |
| [in] | attrs | Additional attributes to save within the rootElement |
| [in] | includeConfig | whether the current config should be included as XML comment |
Reimplemented in PlainXMLFormatter.
Definition at line 77 of file OutputFormatter.h.
References UNUSED_PARAMETER.
Referenced by OutputDevice::writeXMLHeader().
| bool ParquetFormatter::wroteHeader | ( | ) const | inline, virtual |
Returns whether a header has been written. Useful to detect whether a file is being used by multiple sources.
Implements OutputFormatter.
Definition at line 135 of file ParquetFormatter.h.
References myWroteHeader.
| const int ParquetFormatter::myBatchSize | private |
the number of rows to write per batch
Definition at line 177 of file ParquetFormatter.h.
Referenced by closeTag().
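Writing rows in batches typically means buffering them in memory and handing them to the writer once a threshold is reached, with a final flush for the remainder. The following is a generic sketch of that pattern; `BatchBuffer` and its sink callback are hypothetical and stand in for the Arrow builders and the Parquet writer used by the real class.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical row buffer that forwards rows to a sink in fixed-size batches.
template <class Row>
class BatchBuffer {
public:
    using Sink = std::function<void(const std::vector<Row>&)>;

    BatchBuffer(std::size_t batchSize, Sink sink)
        : myBatchSize(batchSize), mySink(std::move(sink)) {}

    // Buffer one row and flush as soon as the batch is full.
    void append(Row row) {
        myRows.push_back(std::move(row));
        if (myRows.size() >= myBatchSize) {
            flush();
        }
    }

    // Hand all buffered rows to the sink; must also be called once at the
    // end so that a partially filled batch is not lost.
    void flush() {
        if (myRows.empty()) {
            return;
        }
        mySink(myRows);
        myRows.clear();
    }

private:
    std::size_t myBatchSize;
    Sink mySink;
    std::vector<Row> myRows;
};
```

A larger batch size trades memory for fewer, larger writes, which is the usual reason to expose it as a constructor parameter.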
| std::vector< std::shared_ptr< arrow::ArrayBuilder > > ParquetFormatter::myBuilders | private |
the content array builders for the table
Definition at line 189 of file ParquetFormatter.h.
Referenced by closeTag(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), and writeTime().
| bool ParquetFormatter::myCheckColumns = false | private |
whether the columns should be checked for completeness
Definition at line 204 of file ParquetFormatter.h.
Referenced by checkAttr(), closeTag(), setExpectedAttributes(), writeAttr(), writeAttr(), and writeAttr().
| parquet::Compression::type ParquetFormatter::myCompression = parquet::Compression::UNCOMPRESSED | private |
the compression to use
Definition at line 174 of file ParquetFormatter.h.
Referenced by closeTag(), and ParquetFormatter().
| std::string ParquetFormatter::myCurrentTag | private |
the currently read tag (only valid when generating the header)
Definition at line 180 of file ParquetFormatter.h.
Referenced by getAttrString(), openTag(), and openTag().
| SumoXMLAttrMask ParquetFormatter::myExpectedAttrs | private |
the attributes which are expected for a complete row (including null values)
Definition at line 207 of file ParquetFormatter.h.
Referenced by checkAttr(), closeTag(), and setExpectedAttributes().
| const std::string ParquetFormatter::myHeaderFormat | private |
the format to use for the column names
Definition at line 171 of file ParquetFormatter.h.
Referenced by getAttrString().
| int ParquetFormatter::myMaxDepth = 0 | private |
the maximum depth of the XML hierarchy
Definition at line 198 of file ParquetFormatter.h.
Referenced by checkAttr(), closeTag(), openTag(), openTag(), and setExpectedAttributes().
| std::unique_ptr< parquet::arrow::FileWriter > ParquetFormatter::myParquetWriter | private |
the output stream writer
Definition at line 186 of file ParquetFormatter.h.
Referenced by closeTag().
| std::shared_ptr< arrow::Schema > ParquetFormatter::mySchema = arrow::schema({}) | private |
the table schema
Definition at line 183 of file ParquetFormatter.h.
Referenced by closeTag(), getAttrString(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), and writeTime().
| SumoXMLAttrMask ParquetFormatter::mySeenAttrs | private |
the attributes already seen (including null values)
Definition at line 210 of file ParquetFormatter.h.
Referenced by checkAttr(), and closeTag().
| const OutputFormatterType OutputFormatter::myType | private, inherited |
the type of formatter being used (XML, CSV, Parquet, etc.)
Definition at line 168 of file OutputFormatter.h.
Referenced by OutputFormatter::getType().
| std::vector< std::shared_ptr< arrow::Scalar > > ParquetFormatter::myValues | private |
the current attribute / column values
Definition at line 195 of file ParquetFormatter.h.
Referenced by closeTag(), openTag(), openTag(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), and writeTime().
| bool ParquetFormatter::myWroteHeader = false | private |
whether the schema has been constructed completely
Definition at line 201 of file ParquetFormatter.h.
Referenced by closeTag(), openTag(), openTag(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeAttr(), writeTime(), and wroteHeader().
| std::vector< int > ParquetFormatter::myXMLStack | private |
The number of attributes in the currently open XML elements.
Definition at line 192 of file ParquetFormatter.h.
Referenced by checkAttr(), closeTag(), openTag(), and openTag().