Tutorial

The cfinterface framework is meant for developing low-level interfaces to complex textual or binary files, where explicit modeling of the file schema is a desired feature. The abstractions defined by the framework allow the user to divide the modeling of file schemas into meaningful pieces, enabling reuse of schemas and content while reading and writing files.

Three main file types are provided by the framework: BlockFile, SectionFile and RegisterFile. Each of these file models is meant for use in specific situations.

BlockFiles are files that can be modeled as blocks of text (or bytes), each easily identified by a given beginning pattern and an optional ending pattern. These patterns can be given as regexes in the entity definition. The steps for reading a block are up to the user to define by overriding the read function.

SectionFiles are a special case of BlockFiles that don't follow beginning or ending patterns, but can be divided into sections, which are direct divisions of the file content into subsets of lines or bytes. These are usually the simplest files, where the content does not vary in amount or contain repeating pieces.

RegisterFiles can also be seen as a special case of BlockFiles, where each block contains only one line. The actual implementation of the internals of a RegisterFile differs, however, from a BlockFile. Using the Line entity from the framework, RegisterFiles are built so that many registers can be defined with little overhead for the developer, allowing an extensive set of patterns to be modeled with little code maintenance.

Each of the file types, together with some additional details on other abstractions provided by the framework, is briefly presented in the following. For more details on each approach, check the examples page. If your use case is not covered, please open an Issue.

Fields

The most fundamental components in the cfinterface framework are the Fields. Defined for both textual and binary interfaces, fields are containers for one specific data value, with a specific formatting, located in a file. The common base Field class defines generic reading and writing functions, whose implementations are given by each specific subclass.

Each specific Field has its own arguments. For instance, suppose the file being modeled contains a line from which a specific literal value is desired:

|   Username (max. 30 chars)   | ...
|   FooBar                     | ...

One can see that the Username begins in the second column and will contain at most 30 characters. So, one can define a LiteralField that will extract, format and store the content:

from cfinterface import LiteralField

username = LiteralField(size=30, starting_position=1)

value = username.read("|   FooBar                     | ...")
# The content "FooBar" can be accessed by both value and username.value
assert value == "FooBar"
assert username.value == "FooBar"

Other fields are used for storing numeric values, such as IntegerField and FloatField. The DatetimeField is used specifically for constructing a datetime object directly from the file contents, following one or more format strings.
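Conceptually, reading any fixed-width field reduces to slicing the line at the field's position, stripping padding and converting the token. The following plain-Python sketch illustrates the idea (it is not the library's implementation):

```python
from datetime import datetime


def read_field(line: str, start: int, size: int, convert):
    """Sketch of a fixed-width field read: slice the line at
    [start, start + size), strip the padding and convert the token."""
    token = line[start:start + size].strip()
    return convert(token) if token else None


line = "|   FooBar                     |  2020-05-20 |  18 |"
print(read_field(line, 1, 30, str))  # FooBar
print(read_field(line, 46, 5, int))  # 18
print(read_field(line, 32, 13, lambda s: datetime.strptime(s, "%Y-%m-%d")))
```

IntegerField, FloatField and DatetimeField differ mainly in the conversion applied to the sliced token.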

Line

Usually a line in a file contains more than just one piece of desired information. In these cases, a single Field is not enough to model the line for reading and writing, and the Line component is the one suited for the task, being defined as a simple collection of fields. Continuing the previous example, suppose that the actual file lines contain more than just the username:

|   Username (max. 30 chars)   | Signup Date | Age | Balance ($) | ...
|   FooBar                     |  2020-05-20 |  18 |       99.90 | ...

The line contents now are modeled by a list of fields, which define a Line:

from datetime import datetime

from cfinterface import LiteralField, DatetimeField, IntegerField, FloatField, Line

username = LiteralField(size=30, starting_position=1)
signup_date = DatetimeField(size=13, starting_position=32, format="%Y-%m-%d")
age = IntegerField(size=5, starting_position=46)
balance = FloatField(size=13, starting_position=52, decimal_digits=2)

line = Line(fields=[username, signup_date, age, balance])

values = line.read("|   FooBar                     |  2020-05-20 |  18 |       99.90 | ...")
assert values == ["FooBar", datetime(2020, 5, 20), 18, 99.90]
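Writing is the inverse operation: each value is formatted to its field width and placed at its starting position. A plain-Python sketch of the idea (not the library's API):

```python
def format_fixed_width(values_specs, width: int) -> str:
    """Place each pre-formatted value into its column range on a blank line.

    values_specs: (text, starting_position, size) triples, mirroring
    the kind of field definitions shown above.
    """
    chars = [" "] * width
    for text, start, size in values_specs:
        # Right-justify within the field width, truncating if too long
        token = text.rjust(size)[:size]
        chars[start:start + size] = token
    return "".join(chars)


row = format_fixed_width(
    [("2020/07", 0, 7), ("1", 11, 4), ("1000.0", 19, 6)],
    width=25,
)
print(repr(row))  # '2020/07       1    1000.0'
```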

Blocks and BlockFiles

Suppose there is a file whose content resembles the following:

MY_FIRST_BLOCK_BEGINNING_PATTERN
I have some raw text for describing my content, because I was meant to be directly read by someone.

Now I have some data. After which I will be done.

Date       Index     Value
2020/01        1    1000.0
2020/02        1    2000.0
2020/01        2    3000.0
2020/02        2    4000.0
2020/01        3    5000.0
2020/02        3    6000.0
MY_FIRST_BLOCK_ENDING_PATTERN

MY_SECOND_BLOCK_BEGINNING_PATTERN
My content is completely different from the previous block...
  Username    Last Login
     admin    1996/01/01
  sunshine    2000/01/01
 pineapple    1996/01/01
     admin    1996/01/01
MY_SECOND_BLOCK_ENDING_PATTERN

MY_FIRST_BLOCK_BEGINNING_PATTERN
...
MY_FIRST_BLOCK_ENDING_PATTERN

MY_FIRST_BLOCK_BEGINNING_PATTERN
...
MY_FIRST_BLOCK_ENDING_PATTERN

MY_SECOND_BLOCK_BEGINNING_PATTERN
...
MY_SECOND_BLOCK_ENDING_PATTERN
...

One may notice that the file is composed of two kinds of blocks with clear beginning and ending patterns, written in no specific order in the file. Even the number of repetitions of each block cannot be discovered without parsing the whole file at least once. In this case, a BlockFile is the best approach for modeling it.
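The scan that a BlockFile performs can be pictured as a search for beginning patterns over the file lines. This is a simplified sketch of the idea, not the framework's actual implementation:

```python
import re


def scan_blocks(lines, begin_patterns):
    """Return (line_index, block_name) pairs for every line that matches
    one of the given beginning patterns, in file order."""
    hits = []
    for i, line in enumerate(lines):
        for name, pattern in begin_patterns.items():
            if re.search(pattern, line):
                hits.append((i, name))
    return hits


lines = [
    "MY_FIRST_BLOCK_BEGINNING_PATTERN",
    "...",
    "MY_FIRST_BLOCK_ENDING_PATTERN",
    "MY_SECOND_BLOCK_BEGINNING_PATTERN",
    "...",
    "MY_SECOND_BLOCK_ENDING_PATTERN",
]
patterns = {
    "first": r"MY_FIRST_BLOCK_BEGINNING_PATTERN",
    "second": r"MY_SECOND_BLOCK_BEGINNING_PATTERN",
}
print(scan_blocks(lines, patterns))  # [(0, 'first'), (3, 'second')]
```

Once a beginning pattern matches, the corresponding block's own read method takes over until its ending pattern is found.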

One possible approach for modeling the file using the BlockFile abstraction is:

from typing import IO
import pandas as pd

from cfinterface import IntegerField, FloatField, DatetimeField, LiteralField
from cfinterface import Line, Block, BlockFile

class FirstBlock(Block):

    __slots__ = [
        "__header_lines",
        "__line_model",
    ]

    BEGIN_PATTERN = "MY_FIRST_BLOCK_BEGINNING_PATTERN"
    END_PATTERN = "MY_FIRST_BLOCK_ENDING_PATTERN"

    NUM_HEADER_LINES = 5

    def __init__(self, previous=None, next=None, data=None) -> None:
        super().__init__(previous, next, data)
        self.__header_lines = []
        date_field = DatetimeField(size=7, starting_position=0, format="%Y/%m")
        index_field = IntegerField(size=4, starting_position=11)
        value_field = FloatField(size=6, starting_position=19)
        self.__line_model = Line([date_field, index_field, value_field])

    def __eq__(self, o: object) -> bool:
        if not isinstance(o, FirstBlock):
            return False
        block: FirstBlock = o
        if not all(
            [
                isinstance(self.data, pd.DataFrame),
                isinstance(block.data, pd.DataFrame),
            ]
        ):
            return False
        else:
            return self.data.equals(block.data)

    # Override
    def read(self, file: IO, *args, **kwargs) -> bool:

        # Discards the line with the beginning pattern
        file.readline()

        # Stores the header lines for writing them back later
        for _ in range(self.__class__.NUM_HEADER_LINES):
            self.__header_lines.append(file.readline())

        # Reads the data content until the ending pattern is found
        dates = []
        indices = []
        values = []

        while True:

            line = file.readline()
            if FirstBlock.ends(line):
                self.data = pd.DataFrame({"Date": dates, "Index": indices, "Value": values})
                return True

            date, index, value = self.__line_model.read(line)
            dates.append(date)
            indices.append(index)
            values.append(value)

    # Override
    def write(self, file: IO, *args, **kwargs):

        file.write(self.__class__.BEGIN_PATTERN + "\n")

        # Writes header lines
        for line in self.__header_lines:
            file.write(line)

        # Writes data lines
        for _, line in self.data.iterrows():
            file.write(self.__line_model.write([line["Date"], line["Index"], line["Value"]]))

        file.write(self.__class__.END_PATTERN + "\n")


class SecondBlock(Block):
    # Implement in a similar way for the second block specifics
    ...


class MyBlockFile(BlockFile):

    BLOCKS = [
        FirstBlock,
        SecondBlock,
    ]

    # All the reading and writing logic is done by the framework,
    # finding when each block begins and calling their implementations.
    # The user can implement some properties to better suit its use cases:

    @property
    def first_block_data(self) -> pd.DataFrame:
        block_dfs = [b.data for b in self.data.get_blocks_of_type(FirstBlock)]
        return pd.concat(block_dfs, ignore_index=True)

file = MyBlockFile.read("/path/to/file_described_above.txt")
assert type(file.first_block_data) == pd.DataFrame
file.write("/path/to/some_other_desired_file.txt")
# The content of the written file should be the same
# as the source file

As one can see, the read and write methods are implemented in a generic way in the base BlockFile class, and will deal with any of the block types informed in the BLOCKS class attribute. However, each Block must implement its own read and write methods, which will be called when the BlockFile class successfully matches one of the BEGIN_PATTERN expressions. All the blocks that were successfully read will be stored in the data field, accessible inside the built file object. This is a BlockData object, which implements a doubly linked list of the blocks that were parsed from the given file.

The developer may edit any of the desired blocks or any of their fields. When calling the write function, all blocks will be written to the file, following the logic of their own write functions, implemented by the developer.

Any content in the file that was not matched to any of the given blocks is stored as an instance of the DefaultBlock object, which is a one-line block for reproducing the entire file contents when writing the file back.

Registers and RegisterFiles

A special case of blocks in a file occurs when each block is exactly one line long. In this case, defining all the requirements of the Block + BlockFile approach can be excessive.

For handling this special case, the developer can use another approach, defined by the RegisterFile abstraction together with the Register components.

Suppose there is a file with the following content

DATA_HIGH  ID001   sudo.user  10/20/2025  901.25
DATA_HIGH  ID002   sudo.user  10/21/2025  100.20
DATA_HIGH  ID003   test.user  10/30/2025  100.20

DATA_LOW   01/01/2024   105.23
DATA_LOW   01/02/2024    29.15
DATA_LOW   01/03/2024     5.05

In this case, each line is identified by a unique beginning pattern in its first columns, together with a set of fields that are positioned in different places depending on the beginning pattern.

Each pattern determines a different Register class to be built, and the entire file can have a variable number of objects for each register.
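The dispatch of each line to its register class boils down to comparing the first IDENTIFIER_DIGITS characters of the line against each identifier. A conceptual sketch, using plain dicts in place of Register classes:

```python
def match_identifier(line: str, registers: list[dict]):
    """Pick the register whose identifier matches the first columns
    of the line.

    Each register here is a plain dict with 'identifier' and 'digits'
    keys, mimicking the IDENTIFIER / IDENTIFIER_DIGITS class attributes.
    """
    for reg in registers:
        if line[: reg["digits"]] == reg["identifier"]:
            return reg["identifier"]
    return None  # unmatched lines are kept as default content


registers = [
    {"identifier": "DATA_HIGH", "digits": 9},
    {"identifier": "DATA_LOW", "digits": 8},
]
print(match_identifier("DATA_HIGH  ID001   sudo.user  10/20/2025  901.25", registers))  # DATA_HIGH
print(match_identifier("DATA_LOW   01/01/2024   105.23", registers))  # DATA_LOW
print(match_identifier("# a comment line", registers))  # None
```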

One possible approach for modeling the file using the RegisterFile abstraction is:

from datetime import datetime

from cfinterface import IntegerField, FloatField, DatetimeField, LiteralField
from cfinterface import Line, Register, RegisterFile

class DataHigh(Register):
    IDENTIFIER = "DATA_HIGH"
    IDENTIFIER_DIGITS = 9
    LINE = Line(
        [
            LiteralField(size=6, starting_position=11),
            LiteralField(size=9, starting_position=19),
            DatetimeField(size=10, starting_position=30, format="%m/%d/%Y"),
            FloatField(size=6, starting_position=42, decimal_digits=2),
        ]
    )

    @property
    def field_id(self) -> str:
        """
        Identifier of the DataHigh element.
        """
        return self.data[0]

    @field_id.setter
    def field_id(self, v: str):
        self.data[0] = v

    @property
    def user(self) -> str:
        """
        User associated with the DataHigh element.
        """
        return self.data[1]

    @user.setter
    def user(self, v: str):
        self.data[1] = v

    @property
    def date(self) -> datetime:
        """
        Date associated with the DataHigh element.
        """
        return self.data[2]

    @date.setter
    def date(self, v: datetime):
        self.data[2] = v

    @property
    def value(self) -> float:
        """
        Value associated with the DataHigh element.
        """
        return self.data[3]

    @value.setter
    def value(self, v: float):
        self.data[3] = v


class DataLow(Register):
    IDENTIFIER = "DATA_LOW"
    IDENTIFIER_DIGITS = 8
    LINE = Line(
        [
            DatetimeField(size=10, starting_position=11, format="%m/%d/%Y"),
            FloatField(size=6, starting_position=24, decimal_digits=2),
        ]
    )

    @property
    def date(self) -> datetime:
        """
        Date associated with the DataLow element.
        """
        return self.data[0]

    @date.setter
    def date(self, v: datetime):
        self.data[0] = v

    @property
    def value(self) -> float:
        """
        Value associated with the DataLow element.
        """
        return self.data[1]

    @value.setter
    def value(self, v: float):
        self.data[1] = v


class MyRegisterFile(RegisterFile):

    REGISTERS = [
        DataHigh,
        DataLow,
    ]

    # All the reading and writing logic is done by the framework,
    # finding which register is in each line and calling their implementations.
    # The user can implement some properties to better suit its use cases:

    @property
    def data_high(self) -> DataHigh | list[DataHigh] | None:
        return self.data.get_registers_of_type(DataHigh)

    @property
    def data_low(self) -> DataLow | list[DataLow] | None:
        return self.data.get_registers_of_type(DataLow)

file = MyRegisterFile.read("/path/to/file_described_above.txt")
assert len(file.data_high) == 3
assert file.data_high[0].field_id == "ID001"
file.write("/path/to/some_other_desired_file.txt")
# The content of the written file should be the same
# as the source file

As one can see, the read and write methods are implemented in a generic way in the base RegisterFile class, and will deal with any of the register types informed in the REGISTERS class attribute. All the registers that were successfully read will be stored in the data field, accessible inside the built file object. This is a RegisterData object, which implements a doubly linked list of the registers that were parsed from the given file.

The developer may edit any of the desired registers or any of their fields. When calling the write function, all registers will be written to the file, following the formatting of each field.

Any content in the file that was not matched to any of the given registers is stored as an instance of the DefaultRegister object, which is a one-field register for reproducing the entire file contents when writing the file back.

Sections and SectionFiles

Another special case of blocks in a file is when the beginning pattern of each block does not matter. In this case, the file to be modeled is usually well determined in terms of content and ordering. However, if the developer also models other files using the BlockFile and RegisterFile approaches and wants to maintain all the files in the same framework, the Section and SectionFile abstractions can be used. Also, following the framework allows versioning each file part separately, which can be useful for schemas that change over time.

Suppose there is a file with the following content

Date       Index     Value
----------------------------
2020/01        1    1000.0
2020/02        1    2000.0
2020/01        2    3000.0
2020/02        2    4000.0
2020/01        3    5000.0
2020/02        3    6000.0
----------------------------

  Username    Last Login
-------------------------
     admin    1996/01/01
  sunshine    2000/01/01
 pineapple    1996/01/01
     admin    1996/01/01
-------------------------

If the file is such that these two tables will always be exhibited, in the same order, and there is no chance of these information blocks repeating, the SectionFile approach can be used. Also, one may note that there are no clear beginning and ending patterns like in the previous BlockFile example.

One possible approach for modeling the file using the SectionFile abstraction is:

from typing import IO
import pandas as pd

from cfinterface import IntegerField, FloatField, DatetimeField, LiteralField
from cfinterface import Line, Section, SectionFile

class FirstSection(Section):

    __slots__ = [
        "__line_model",
    ]

    HEADER_LINE = "Date       Index     Value"
    MARGIN_LINE = "----------------------------"

    def __init__(self, previous=None, next=None, data=None) -> None:
        super().__init__(previous, next, data)
        date_field = DatetimeField(size=7, starting_position=0, format="%Y/%m")
        index_field = IntegerField(size=4, starting_position=11)
        value_field = FloatField(size=6, starting_position=19)
        self.__line_model = Line([date_field, index_field, value_field])

    def __eq__(self, o: object) -> bool:
        if not isinstance(o, FirstSection):
            return False
        section: FirstSection = o
        if not all(
            [
                isinstance(self.data, pd.DataFrame),
                isinstance(section.data, pd.DataFrame),
            ]
        ):
            return False
        else:
            return self.data.equals(section.data)

    # Override
    def read(self, file: IO, *args, **kwargs) -> bool:

        # Discards the header and margin lines
        for _ in range(2):
            file.readline()

        # Reads the data content until the closing margin is found
        dates = []
        indices = []
        values = []

        while True:

            line = file.readline()
            if self.MARGIN_LINE in line:
                self.data = pd.DataFrame({"Date": dates, "Index": indices, "Value": values})
                return True

            date, index, value = self.__line_model.read(line)
            dates.append(date)
            indices.append(index)
            values.append(value)

    # Override
    def write(self, file: IO, *args, **kwargs):

        file.write(self.HEADER_LINE + "\n")
        file.write(self.MARGIN_LINE + "\n")

        # Writes data lines
        for _, line in self.data.iterrows():
            file.write(self.__line_model.write([line["Date"], line["Index"], line["Value"]]))

        file.write(self.MARGIN_LINE + "\n")


class SecondSection(Section):
    # Implement in a similar way for the second section specifics
    ...


class MySectionFile(SectionFile):

    SECTIONS = [
        FirstSection,
        SecondSection,
    ]

    # All the reading and writing logic is done by the framework.
    # The user can implement some properties to better suit its use cases:

    @property
    def first_section_data(self) -> pd.DataFrame:
        sections = self.data.get_sections_of_type(FirstSection)
        return sections[0].data

file = MySectionFile.read("/path/to/file_described_above.txt")
assert type(file.first_section_data) == pd.DataFrame
file.write("/path/to/some_other_desired_file.txt")
# The content of the written file should be the same
# as the source file

As one can see, the read and write methods are implemented in a generic way in the base SectionFile class, and will call the specific section functions in the same order that they were declared in the SECTIONS class attribute. All the sections that were successfully read will be stored in the data field, accessible inside the built file object. This is a SectionData object, which implements a doubly linked list of the sections that were parsed from the given file.

The developer may edit any of the desired sections or any of their fields. When calling the write function, all sections will be written to the file, following the formatting of each field.

All data that may exist in the file after the last modeled section will be read as DefaultSection objects. These are one-line sections used for compatibility with data not explicitly modeled by the developer.

File Encodings

Currently, when modeling a file through any of the aforementioned approaches, the developer can choose the preferred encoding for reading and writing. Instead of a single encoding, a list of encodings can also be supplied, which will be used for reading and writing.

Some ways for specifying encodings are:

from cfinterface import BlockFile

class MyBlockFileWithSingleEncoding(BlockFile):

    ENCODING = "utf-8"


class MyBlockFileWithManyEncodings(BlockFile):

    ENCODING = ["utf-8", "latin-1", "ascii"]

When reading, each of the supplied encodings will be used, in order. The first encoding to successfully parse the whole file will end the reading process. For writing, the file model will always use the first encoding of the list.
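The fallback idea can be illustrated in plain Python (a conceptual sketch, not the framework's code):

```python
def decode_with_fallback(raw: bytes, encodings: list[str]) -> tuple[str, str]:
    """Try each encoding in order and return the decoded text together
    with the encoding that succeeded."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode the content")


# 0xE1 ("á" in latin-1) is not valid UTF-8 here, so utf-8 fails
# and latin-1 is the encoding that succeeds
print(decode_with_fallback("Olá".encode("latin-1"), ["utf-8", "latin-1"]))
```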

Modeling Binary Files

When a file contains data encoded in binary format instead of text, the cfinterface framework can still be applied for modeling its contents, supporting reading and writing. The same file models can be used, but with some differences in the meaning of some fundamental actions, which are better illustrated in the examples page.

For defining a file model as binary, use the StorageType enum:

from cfinterface import BlockFile
from cfinterface.storage import StorageType

class MyTextualFile(BlockFile):

    STORAGE = StorageType.TEXT

class MyBinaryFile(BlockFile):

    STORAGE = StorageType.BINARY

Note

Bare string values (STORAGE = "TEXT" or STORAGE = "BINARY") still work for backward compatibility, but emit a DeprecationWarning. Prefer StorageType.TEXT and StorageType.BINARY in new code.

Versioning Files

Files can change their schema over time, resulting in multiple versions. One approach is to define multiple file models, but this could result in large amounts of copied and pasted code, since the changes between schemas could be minimal.

The cfinterface framework supports file versioning by allowing the user to define the list of elements (Blocks, Registers or Sections) that exists in each version of the file.

As an example, suppose there is a file that was versioned. The file always contained two blocks, but one of them had a small change to its schema when the file evolved from version 1.0 to 2.0. The VERSIONS dict maps version keys to their component lists:

from cfinterface import Block, BlockFile

class MyConstantBlock(Block):
    pass

class MyVersionedOldBlock(Block):
    pass

class MyVersionedNewBlock(Block):
    pass

class MyVersionedFile(BlockFile):

    VERSIONS = {
        "1.0": [
            MyConstantBlock,
            MyVersionedOldBlock,
        ],
        "2.0": [
            MyConstantBlock,
            MyVersionedNewBlock,
        ],
    }

To read a specific version, pass the version keyword to read():

old_file = MyVersionedFile.read("path/to/old/file", version="1.0")
new_file = MyVersionedFile.read("path/to/new/file", version="2.0")

When no version is given, the default BLOCKS list is used (typically the latest schema). If version is given but has no exact match in VERSIONS, the framework resolves it to the latest available version W such that W <= version (lexicographic comparison). For example, requesting version "1.5" would resolve to "1.0" in the example above.
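The resolution rule described above can be expressed in a few lines. This is a sketch of the rule as stated, not the library's resolve_version implementation:

```python
def resolve(requested: str, versions: dict):
    """Return the component list for the latest key <= requested
    (lexicographic comparison), or None if every key is newer."""
    candidates = [v for v in versions if v <= requested]
    return versions[max(candidates)] if candidates else None


VERSIONS = {
    "1.0": ["MyConstantBlock", "MyVersionedOldBlock"],
    "2.0": ["MyConstantBlock", "MyVersionedNewBlock"],
}
print(resolve("1.5", VERSIONS))  # ['MyConstantBlock', 'MyVersionedOldBlock']
print(resolve("2.0", VERSIONS))  # ['MyConstantBlock', 'MyVersionedNewBlock']
```

Note that lexicographic comparison has the usual caveat that "10.0" sorts before "2.0"; zero-padded or tuple-based version keys avoid surprises.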

The same version keyword is available on read_many() for batch reading:

files = MyVersionedFile.read_many(
    ["path/to/file1", "path/to/file2"],
    version="2.0",
)

You can also use resolve_version() directly for custom version logic:

from cfinterface import resolve_version

components = resolve_version("1.5", MyVersionedFile.VERSIONS)
# Returns the component list for version "1.0"

After reading, you can validate whether the parsed data matches the expected version schema using the validate() method:

file = MyVersionedFile.read("path/to/file", version="2.0")
result = file.validate(version="2.0")

# result is a VersionMatchResult with fields:
# matched, expected_types, found_types, missing_types,
# unexpected_types, default_ratio
if not result.matched:
    print(f"Missing types: {result.missing_types}")
    print(f"Default ratio: {result.default_ratio:.1%}")

Deprecated: The set_version() class method is deprecated and will be removed in a future release. It mutates class-level state, which can cause issues in concurrent or multi-version workflows. Use read(path, version="...") instead.

Tabular Data

For files where data is organized in a regular tabular layout — fixed-width columns or delimiter-separated fields — the TabularParser class provides a higher-level API built on top of Line and Field.

A tabular schema is declared with a list of ColumnDef named tuples, each mapping a column name to its Field instance:

from cfinterface import IntegerField, FloatField, LiteralField
from cfinterface.components.tabular import ColumnDef, TabularParser

columns = [
    ColumnDef(name="City", field=LiteralField(size=12, starting_position=0)),
    ColumnDef(name="Population", field=IntegerField(size=10, starting_position=12)),
    ColumnDef(name="Area", field=FloatField(size=8, starting_position=22, decimal_digits=1)),
]

parser = TabularParser(columns)

Once created, the parser can convert raw text lines into a dict-of-lists and back:

lines = [
    "Springfield  1200000    115.4\n",
    "Shelbyville   800000     98.2\n",
]

data = parser.parse_lines(lines)
# data == {"City": ["Springfield", "Shelbyville"],
#          "Population": [1200000, 800000],
#          "Area": [115.4, 98.2]}

roundtrip = parser.format_rows(data)
# roundtrip produces lines with the same fixed-width layout

For delimiter-separated data (CSV-style), pass the delimiter parameter:

delimited_columns = [
    ColumnDef(name="Name", field=LiteralField(size=20, starting_position=0)),
    ColumnDef(name="Score", field=FloatField(size=10, starting_position=0, decimal_digits=2)),
]

csv_parser = TabularParser(delimited_columns, delimiter=",")
data = csv_parser.parse_lines(["Alice,95.50\n", "Bob,87.25\n"])
# data == {"Name": ["Alice", "Bob"], "Score": [95.50, 87.25]}

Note

When using delimiter, the starting_position of each field is ignored — only size (maximum token width) applies.
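Delimiter-based parsing can be sketched without the library: split each line on the delimiter and collect tokens per column. In the real TabularParser each token is also converted by its Field; in this sketch tokens stay as strings:

```python
def parse_delimited(lines: list[str], names: list[str], delimiter: str = ","):
    """Split each line on the delimiter and collect tokens per column,
    producing the dict-of-lists shape shown above (values as strings)."""
    data: dict[str, list[str]] = {name: [] for name in names}
    for line in lines:
        tokens = line.strip().split(delimiter)
        for name, token in zip(names, tokens):
            data[name].append(token)
    return data


rows = parse_delimited(["Alice,95.50\n", "Bob,87.25\n"], ["Name", "Score"])
print(rows)  # {'Name': ['Alice', 'Bob'], 'Score': ['95.50', '87.25']}
```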

For files where a full section is tabular, the TabularSection base class combines TabularParser with the Section read/write lifecycle:

from cfinterface import IntegerField, FloatField, LiteralField
from cfinterface import SectionFile
from cfinterface.components.tabular import ColumnDef, TabularSection

class ScoreSection(TabularSection):
    COLUMNS = [
        ColumnDef(name="Name", field=LiteralField(size=15, starting_position=0)),
        ColumnDef(name="Score", field=IntegerField(size=5, starting_position=15)),
    ]
    HEADER_LINES = 2
    END_PATTERN = r"^---"

class ScoreFile(SectionFile):
    SECTIONS = [ScoreSection]

If pandas is installed (pip install cfinterface[pandas]), you can convert the parsed dict-of-lists to a DataFrame:

df = TabularParser.to_dataframe(data)