Introducing TinyFlux: The Tiny Time Series Database for Python-based IoT & Analytics Applications

5 min read · Jul 19, 2022

While datasets come in a nearly infinite number of shapes and sizes, the same cannot be said for data stores. Sure, any great piece of software should be able to handle a range of use-cases from small to large, but between bare-bones text files and unwieldy standalone database servers there are few options for storing and querying data in a user-friendly way.

TinyDB by Markus Siemens, however, wonderfully occupies this niche in the Python ecosystem for document-like datasets. TinyDB is a lightweight, open-source Python package that provides the API and functionality of a document-oriented datastore with the simplicity of flat, human-readable files. If your dataset can be represented as key/value pairs and you won’t be working with gigabytes of data in a distributed manner, TinyDB is a stellar package that you can integrate into your Python workflow as a bona fide database in mere seconds.

One of TinyDB’s greatest strengths (as well as one of its greatest limitations) is that it is backed in most of its use-cases by the JSON file format. Human-readability, cross-platform interoperability, and a resemblance to Python’s dictionary object are all great benefits of JSON. However, due in part to its syntax rules (the need for enclosing brackets, braces…), the file, and by proxy the database, cannot be appended to without reading the entire database into memory and writing it back out. Writes therefore become an increasingly costly operation as the file grows.
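To make the difference concrete, here is a minimal sketch using only the standard library. It assumes a JSON file that holds a single top-level array, which is a simplified stand-in and not TinyDB's actual storage code; the helper names `append_json` and `append_csv` are made up for this illustration.

```python
import csv
import json
import os
import tempfile


def append_json(path, record):
    """Add a record to a JSON array file: read everything, then rewrite everything."""
    records = []
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            records = json.load(f)
    records.append(record)
    # The closing bracket must stay at the end, so the whole array is rewritten.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)
    return len(records)  # number of records re-serialized for this one insert


def append_csv(path, row):
    """Add a row to a CSV file: open in append mode and write one line."""
    with open(path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(row)


def count_csv_rows(path):
    """Count the rows currently stored in a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        return sum(1 for _ in csv.reader(f))


workdir = tempfile.mkdtemp()
json_path = os.path.join(workdir, "readings.json")
csv_path = os.path.join(workdir, "readings.csv")

# Made-up sample values, inserted one at a time into both files.
for i in range(3):
    rewritten = append_json(json_path, {"t": i, "temp": 20.0 + i})
    append_csv(csv_path, [i, 20.0 + i])
```

Each call to `append_json` re-serializes every record already in the file, so the work per insert grows with the dataset. `append_csv` only ever writes the new line.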

As a software engineer who often works with IoT sensor data (and a hobbyist Arduino programmer who works solely with IoT sensor data), I often need to work with timestamps and to perform high-frequency writes. These are core features of time series data, and while TinyDB is versatile, it is not designed for this kind of dataset: one in which keys are timestamps, observations must be sorted, and writes occur much more frequently than reads (often at a high frequency).

It is for this reason that I designed TinyFlux upon the same pillars as TinyDB, while prioritizing time series datasets. Like TinyDB, it is lightweight, contains an ORM-like query syntax, is backed by a text file (CSV in this case), allows for flexible schemas, and is fun to use. Unlike TinyDB, timestamps are first-class data types (specifically, the tricky Python datetime object) and the time it takes to perform a write does not increase with the size of the underlying file.

Repeated Insert Test (TinyDB is using JSON storage while TinyFlux is using CSV storage)

To be sure, TinyDB has and will continue to have its place as a fun, easy-to-use database for document-like data (I currently use it in Dash applications for server-side user configuration). TinyFlux merely attempts to do for time series datasets what TinyDB already does for document-like datasets.

The “Flux” in TinyFlux: Integrating Time Series Concepts from InfluxDB

As of this writing, InfluxDB is among the most mature time series databases used in production. If you need a distributed, production-ready database for your high-frequency time series data, you’d be hard-pressed to find a reason not to consider InfluxDB (my teams have used it in several applications in which we need to analyze 10,000+ Hz embedded sensor streams and it performs quite well).

Performance aside, its key concepts and first-class objects specifically describe the components of a time series dataset (as opposed to, say, a generic columnar dataset). The result is a pragmatic and logical query language that is no longer SQL, but is familiar enough for users to learn and use quickly in time series contexts. For this reason, TinyFlux borrows the key concepts and objects of InfluxDB for its own syntax, hence the -Flux suffix of the TinyFlux moniker.

As a quick illustration, the insert snippet below demonstrates how a “point”, one of the most important concepts in InfluxDB, is integrated into TinyFlux. A Point in InfluxDB is just an individual observation in a time series signal (akin to a “row” in a relational database), and it contains:

A timezone-aware timestamp
Numeric key/values known as fields
Textual key/values known as tags

Fields and tags are treated differently when it comes to indexing, sorting, and querying and thus they are separate but similar concepts.
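The three components listed above can be sketched as a small data structure. This is a hypothetical stand-in written with the standard library, not the TinyFlux or InfluxDB class; the name `SketchPoint`, its validation rules, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


@dataclass
class SketchPoint:
    """Hypothetical stand-in for a point; not the TinyFlux or InfluxDB class."""

    time: datetime
    tags: Dict[str, str] = field(default_factory=dict)
    fields: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self):
        # Reject naive datetimes so every timestamp is unambiguous.
        if self.time.tzinfo is None or self.time.utcoffset() is None:
            raise ValueError("time must be timezone-aware")
        # Tags are textual key/values.
        for key, value in self.tags.items():
            if not isinstance(value, str):
                raise TypeError(f"tag {key!r} must be a string")
        # Fields are numeric key/values; bool is a subclass of int, so exclude it.
        for key, value in self.fields.items():
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"field {key!r} must be numeric")


# Made-up sample observation.
reading = SketchPoint(
    time=datetime(2020, 9, 1, 12, 0, tzinfo=timezone.utc),
    tags={"city": "Los Angeles"},
    fields={"aqi": 112.0},
)
```

Keeping tags and fields in separate mappings mirrors the distinction drawn above: text used to label and filter observations on one side, numeric measurements on the other.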

The TinyFlux/InfluxDB concepts of points, tags, fields, and timestamps.

On the retrieval end, TinyFlux’s query language simply extends that of TinyDB to handle time series data, as in the instance of querying by timestamp:

TinyFlux query language extends that of TinyDB, for time objects.
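For readers curious how an ORM-like query syntax can be made to work on timestamps, here is a rough sketch of the general technique: operator overloading that turns a comparison into a reusable predicate. The classes `Condition` and `TimeKey` and the `search` helper are hypothetical names for this illustration, and nothing here is taken from the TinyFlux or TinyDB source.

```python
import operator
from datetime import datetime, timezone


class Condition:
    """A predicate over a record that can be combined with & and |."""

    def __init__(self, test):
        self.test = test

    def __call__(self, record):
        return self.test(record)

    def __and__(self, other):
        return Condition(lambda r: self(r) and other(r))

    def __or__(self, other):
        return Condition(lambda r: self(r) or other(r))


class TimeKey:
    """Hypothetical query builder: comparing it to a datetime yields a Condition."""

    def _compare(self, op, when):
        return Condition(lambda r: op(r["time"], when))

    def __ge__(self, when):
        return self._compare(operator.ge, when)

    def __lt__(self, when):
        return self._compare(operator.lt, when)


def search(records, condition):
    """Return the records for which the condition holds."""
    return [r for r in records if condition(r)]


# Made-up sample records with timezone-aware timestamps.
records = [
    {"time": datetime(2020, 8, 31, 23, 0, tzinfo=timezone.utc), "aqi": 80},
    {"time": datetime(2020, 9, 1, 0, 0, tzinfo=timezone.utc), "aqi": 95},
    {"time": datetime(2020, 9, 15, 12, 0, tzinfo=timezone.utc), "aqi": 150},
    {"time": datetime(2020, 10, 1, 0, 0, tzinfo=timezone.utc), "aqi": 60},
]

start = datetime(2020, 9, 1, tzinfo=timezone.utc)
end = datetime(2020, 10, 1, tzinfo=timezone.utc)

# Reads almost like the sentence "time >= start and time < end".
september = search(records, (TimeKey() >= start) & (TimeKey() < end))
```

Because the comparison returns an object rather than a boolean, conditions can be built once, combined, and applied to any number of records later.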

By combining the simplicity of TinyDB with the concepts of InfluxDB, TinyFlux represents the middle ground in time series data stores in which you can enjoy the streamlined API of a mature database interface without the need for provisioning and maintaining new instances to host your data. It is nice to know that if you become a user of TinyFlux and find yourself outgrowing the use-case it was designed for, an easy transition to InfluxDB is the natural next step.

Getting Started with TinyFlux

Think TinyFlux is for you? Head over to the GitHub repository. TinyFlux is easily installed from PyPI (e.g. pip install tinyflux) and has been thoroughly tested on all modern versions of CPython (3.7 to 3.10) and PyPy 3.9.

Need some inspiration? Below are a couple of Jupyter Notebooks containing examples of two common use-cases of TinyFlux:

Finally, TinyFlux may not be for you. Not everyone is analyzing time series data using Python (heathens!). However, if you feel like it could potentially be of use to somebody somewhere, please give the project a star on GitHub! I’ll buy you a drink and talk data with you if our trajectories ever cross in the future 🤞.

A data visualization of the California statewide Air Quality Index (AQI) in September 2020, made with TinyFlux and Plotly.

TinyFlux is currently in beta and welcoming any and all contributions/feedback. If you would like to contribute to TinyFlux, check out the README in the repository. Anything else? Drop me a line.


Written by Justin Fung
