Burroughs Prototype

Quick Start

Requirements

Docker
Maven

1. Confluent Platform Setup

First, start the Confluent Platform containers (ZooKeeper, Kafka, ksqlDB, etc.) and the PostgreSQL database, all of which are contained in the Confluent directory. Before starting them, build the custom Kafka Connect image, which contains the custom Burroughs single message transforms.

cd SingleMessageTransforms
./build-connect.sh
cd ../Confluent
docker-compose up -d

Give it a few seconds, then run docker-compose ps to make sure that everything is running. If any container has exited, start it again (re-running docker-compose up -d will do this).
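
The check above can also be scripted. The following is a minimal sketch, not part of Burroughs: the function names are hypothetical, and it assumes that docker-compose ps prints one row per container with the word "Exit" or "exited" in the state column, which may differ between Compose versions.

```python
import subprocess


def exited_rows(ps_output):
    """Return the lines of `docker-compose ps` output that mention an exit state."""
    return [line for line in ps_output.splitlines() if "exit" in line.lower()]


def main():
    # Run this from the Confluent directory so docker-compose finds its compose file.
    try:
        result = subprocess.run(
            ["docker-compose", "ps"], capture_output=True, text=True
        )
    except FileNotFoundError:
        print("docker-compose was not found on PATH")
        return
    if result.returncode != 0:
        print("docker-compose ps failed:")
        print(result.stderr.strip())
        return
    rows = exited_rows(result.stdout)
    if rows:
        print("These containers appear to have exited:")
        print("\n".join(rows))
    else:
        print("No exited containers found.")


if __name__ == "__main__":
    main()
```

Because the match is a plain substring test, treat the result as a hint and confirm it against the docker-compose ps output itself.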

2. Running Burroughs

First, you will need to build Burroughs from source. To do so, run the following (from the root directory).

./build.sh

To run Burroughs on Linux or macOS, run this:

./run.sh

To run Burroughs on Windows, run this:

./run-windows.bat

If you set up everything successfully, you should see something like the following.

You can use .help to see a list of useful commands that are available.

3. Test Data

To actually make use of Burroughs, you'll want some data to play with. This repository contains a demo CSV file and default producer to accomplish this. To run it, execute the following.

.producer transactions_producer start 10000

The above command produces 10,000 records to a Kafka topic for testing. You may specify any limit up to the 100,000 records in the file, or omit the limit to produce every record.

4. Executing a simple query

First, we will need to set a name for our output table:

.table test

Once that is done, enter the following:

select storer, sum(spend) from transactions group by storer

If everything works correctly, the output should look like this:

To see your data, open a second terminal and enter the following:

docker exec -it postgres psql -U postgres burroughs

You should now be able to view your results:

When you're done, run .stop to clean up all of the stream processing infrastructure. Use .quit or Ctrl+D to exit the Burroughs shell.

5. Reading commands from a file

Burroughs can read commands from a file using the .file command. Commands in the file can be either SQL statements or Burroughs commands. All files should be placed in the /commands folder.

.file <filename> <delimiter>

Example file:

.producer transactions_producer start 10000;
.producer customers_producer start 10000;
.table test;
select t.basketnum, count(c.CustId) from transactions as t left join customers as c on c.basketnum = t.basketnum group by t.basketnum;

Example command:

.file input.txt ;
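
To illustrate what the delimiter argument means, the sketch below splits the example file's contents into individual commands. It is only an approximation of the file layout, not Burroughs's actual parser: split_commands is a hypothetical helper, and a plain split like this would also break on a delimiter that appears inside a quoted SQL string.

```python
def split_commands(text, delimiter=";"):
    """Split command-file text on the delimiter, trimming whitespace
    and dropping empty entries (such as the one after a trailing delimiter)."""
    return [part.strip() for part in text.split(delimiter) if part.strip()]


# Contents of the example file shown above.
example = (
    ".producer transactions_producer start 10000; "
    ".producer customers_producer start 10000; "
    ".table test; "
    "select t.basketnum, count(c.CustId) from transactions as t "
    "left join customers as c on c.basketnum = t.basketnum "
    "group by t.basketnum;"
)

# Print each command with its position in the file.
for number, command in enumerate(split_commands(example), start=1):
    print(number, command)
```

Listing the commands this way before running .file can help confirm that the chosen delimiter does not occur inside any single command.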

Running the Burroughs Browser Interface

Burroughs also ships with a browser-based graphical user interface. To use this, do the following:

System Configuration

Environment Variables

Connections between Burroughs and the other components of the stream processing pipeline (Kafka, ksqlDB, and PostgreSQL) are defined by a handful of environment variables. Not all of these variables are required, as many of them have reasonable defaults. If you use Burroughs with the packaged version of the Confluent Platform as described in the previous section, you shouldn't need to change anything. The table below provides a complete list.

Variable	Description
KSQL_HOST	The hostname and port number for the ksqlDB server.
DB_HOST	The hostname and port for the PostgreSQL database.
DB_USER	The database user to provide when connecting.
DB_PASSWORD	The database password.
DATABASE	The database to use. The default is burroughs.
KAFKA_HOST	The hostname and port of the Kafka broker to use.
CONNECTOR_DB	The hostname and port of the PostgreSQL database to provide to the ksqlDB sink connectors. This will likely be the same as DB_HOST.
SCHEMA_REGISTRY	The URL of the Avro schema registry to use. This is only necessary if you plan on using the embedded producer utility.
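
Before launching Burroughs against a non-default setup, it can help to see which of these variables are set in the current shell. The following is a minimal sketch and not part of Burroughs: the variable names come from the table above, while describe_environment and the display strings are hypothetical. It does not guess at the defaults Burroughs applies to unset variables.

```python
import os

# Variable names taken from the table above.
BURROUGHS_VARIABLES = [
    "KSQL_HOST", "DB_HOST", "DB_USER", "DB_PASSWORD",
    "DATABASE", "KAFKA_HOST", "CONNECTOR_DB", "SCHEMA_REGISTRY",
]
# Values that should not be echoed to the terminal.
SECRET_VARIABLES = {"DB_PASSWORD"}


def describe_environment(env=None):
    """Return {variable: display value} for every Burroughs variable."""
    env = os.environ if env is None else env
    report = {}
    for name in BURROUGHS_VARIABLES:
        value = env.get(name)
        if value is None:
            report[name] = "(unset)"
        elif name in SECRET_VARIABLES:
            report[name] = "(set, hidden)"
        else:
            report[name] = value
    return report


if __name__ == "__main__":
    for name, shown in describe_environment().items():
        print(f"{name:16} {shown}")
```

Any variable reported as unset falls back to whatever default Burroughs provides for it.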

Producers

Burroughs is not designed to be used as a data source, but sometimes it is convenient to be able to quickly produce some data for testing. Producers can be created and configured in the producers.json file in the producer directory, which expects an array of producer objects. For example, the default producer that ships with Burroughs looks like this:

  {
    "name": "transactions_producer",
    "topic": "transactions",
    "delay": 0,
    "schema": "transaction.avsc",
    "key_field": "StoreR",
    "data_source": {
      "type": "file",
      "source": {
        "location": "datafiles/transactions.csv",
        "header": true,
        "delimiter": ","
      }
    }
  }

The tables below describe all of these properties and their uses. All paths are relative to the producer folder.

Property	Description	Type	Required	Default
name	Used to reference the producer when executing commands	String	Yes	None
topic	The topic to produce the records onto	String	Yes	None
schema	The path to the Avro schema file for this data source	String	Yes	None
delay	The artificial delay to insert between messages in milliseconds	Integer	No	0
key_field	The field to use as the message key. Must be defined in the schema file. If none is specified, the message value will be used as the key	String	No	None
data_source	The data source to pull records from	Object	Yes	None

The data source object must have a type field which can be either "file" or "database". It must also have a source object whose fields are defined below.

Property	Description	Type	Data source	Default
location	Path to the data file	String	File	None
delimiter	Specifies the delimiter	String	File	,
header	Whether or not the file contains a header line which must be skipped	Boolean	File	false
hostname	Database hostname	String	Database	Burroughs database host
database	Database name	String	Database	Burroughs database name
username	Database user	String	Database	Burroughs database user
password	Database password	String	Database	Burroughs database password
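
The rules in the two tables above can be checked mechanically before starting a producer. The following is a minimal sketch, not a validator that ships with Burroughs: check_producer is a hypothetical helper that only tests the properties marked as required and the allowed data source types, and it is applied here to the default producer shown earlier.

```python
import json

# Properties marked "Required: Yes" in the producer table.
REQUIRED_PROPERTIES = ("name", "topic", "schema", "data_source")
# Allowed values for data_source.type.
DATA_SOURCE_TYPES = ("file", "database")


def check_producer(producer):
    """Return a list of problems with one producer object (empty when none are found)."""
    problems = [
        f"missing required property: {key}"
        for key in REQUIRED_PROPERTIES
        if key not in producer
    ]
    data_source = producer.get("data_source")
    if data_source is None:
        return problems
    if not isinstance(data_source, dict):
        problems.append("data_source must be an object")
        return problems
    if data_source.get("type") not in DATA_SOURCE_TYPES:
        problems.append('data_source.type must be "file" or "database"')
    if not isinstance(data_source.get("source"), dict):
        problems.append("data_source.source must be an object")
    return problems


# The default producer from this README.
default_producer = json.loads("""
{
  "name": "transactions_producer",
  "topic": "transactions",
  "delay": 0,
  "schema": "transaction.avsc",
  "key_field": "StoreR",
  "data_source": {
    "type": "file",
    "source": {
      "location": "datafiles/transactions.csv",
      "header": true,
      "delimiter": ","
    }
  }
}
""")

print(check_producer(default_producer))  # prints []
```

To check a whole producers.json file, load it with json.load and call check_producer on each object in the array.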

Producers can be started using the command:

.producer <name> start [record limit]

To see the other producer operations, refer to the .help documentation.
