Commit f014a924 authored by Richard Fairhurst's avatar Richard Fairhurst
Browse files

First commit

parents
Configuring Tilemaker
=====================
Vector tiles contain (generally thematic) 'layers'. For example, your tiles might contain river, cycleway and railway layers.
You'll generally assign OSM data into layers by making decisions based on their tags. You might put anything with a `highway=` tag into the roads layer, anything with a `railway=` tag into the railway layer, and so on.
In Tilemaker, you achieve this by writing a short script in the Lua programming language. Lua is a simple and fast language used by several other OpenStreetMap tools, such as the OSRM routing engine and osm2pgsql.
In addition, you supply Tilemaker with a JSON file which specifies certain global settings for your tileset.
A note on zoom levels
---------------------
Because vector tiles are so efficiently encoded, you generally don't need to create tiles above (say) zoom level 14. Instead, your renderer will use the data in the z14 tiles to generate z15, z16 etc. (This is called 'overzooming'.)
So when you set a maximum zoom level of 14 in Tilemaker, this doesn't mean you're restricted to displaying maps at z14. It just means that Tilemaker will create z14 tiles, and it's your renderer's job to use these tiles to draw the most detailed maps.
JSON configuration
------------------
The JSON config file sets out the layers you'll be using, and which zoom levels they apply to. For example, you might want to include your roads layer in your z12-z14 tiles, but your buildings at z14 only.
It also includes these global settings:
* `minzoom` - the minimum zoom level at which any tiles will be generated
* `maxzoom` - the maximum zoom level at which any tiles will be generated
* `basezoom` - the zoom level for which Tilemaker will generate tiles internally (should usually be the same as `maxzoom`)
* `include_ids` - whether you want to store the OpenStreetMap IDs for each way/node within your vector tiles
* `name`, `version` and `description` - about your project (these are written into the MBTiles file)
A typical config file would look like this:
{
"layers": {
"roads": { "minzoom": 12, "maxzoom": 14 },
"buildings": { "minzoom": 14, "maxzoom": 14 },
"pois": { "minzoom": 13, "maxzoom": 14 }
},
"settings": {
"minzoom": 12,
"maxzoom": 14,
"basezoom": 14,
"include_ids": false,
"name": "Tilemaker example",
"version": "0.1",
"description": "Sample vector tiles for Tilemaker"
}
}
The order of layers will be carried forward into the vector tile.
Note that all options are compulsory. If Tilemaker baulks at the JSON file, check everything's included, and run it through an online JSON validator to check for syntax errors.
By default Tilemaker expects to find this file at config.json, but you can specify another filename with the `--config` command-line option.
Lua processing
--------------
Your Lua file needs to supply three things:
1. `node_keys`, a list of those OSM keys which indicate that a node should be processed
2. `node_function`, a function to process an OSM node and add it to layers
3. `way_function`, a function to process an OSM way and add it to layers
`node_keys` is a simple list (or in Lua parlance, a 'table') of OSM tag keys. If a node has one of those keys, it will be processed by `node_function`; if not, it'll be skipped. For example, if you wanted to show highway crossings and railway stations, it should be `{ "highway", "railway" }`. (This avoids the need to process the vast majority of nodes which contain no important tags at all.)
`node_function` and `way_function` work the same way. They are called with an OSM object; you then inspect the tags of that object, and put it in your vector tiles' layers based on those tags. In essence, the process is:
* look at tags
* if tags meet criteria, write to a layer
* (optionally) add attributes (= vector tile metadata/tags)
To do that, you use these methods:
* `node:Find(key)` or `way:Find(key)`: get the value for a tag, or the empty string if not present. For example, `way:Find("railway")` might return "rail" for a railway, "siding" for a siding, or "" if it isn't a railway at all.
* `node:Holds(key)` or `way:Holds(key)`: returns true if that key exists, false otherwise.
* `node:Id()` or `way:Id()`: get the OSM ID of the current object.
* `node:Layer("layer_name", false)` or `way:Layer("layer_name", is_area)`: write this node/way to the named layer. This is how you put objects in your vector tile. is_area (true/false) specifies whether a way should be treated as an area, or just as a linestring.
* `node:Attribute(key,value)` or `node:Attribute(key,value)`: add an attribute to the most recently written layer.
* `node:AttributeNumeric(key,value)`, `node:AttributeBoolean(key,value)` (and `way:`...): for numeric/boolean columns.
The simplest possible function, to include roads/paths and nothing else, might look like this:
function way_function(way)
local highway = way:Find("highway")
if highway~="" then
way:Layer("roads", false)
way:Attribute("name", way:Find("name"))
way:Attribute("type", highway)
end
end
Take a look at the supplied process.lua for a full example. You can specify another filename with the `--process` option.
If your Lua file causes an error due to mistaken syntax, you can test it at the command line with `luac -p filename`. Three frequent Lua gotchas: tables (arrays) start at 1, not 0; the "not equal" operator is `~=` (that's the other way round from Perl/Ruby's regex operator); and `if` statements always need a `then`, even when written over several lines.
Relations
---------
Tilemaker handles multipolygon relations natively. The combined geometries are processed as ways (i.e. by `way_function`), so if your function puts buildings in a 'buildings' layer, Tilemaker will cope with this whether the building is mapped as a simple way or a multipolygon. The only difference is that they're given an artificial ID.
Multipolygons are expected to have tags on the relation, not the outer way. The vector tile spec is [slightly vague](https://github.com/mapbox/vector-tile-spec/issues/30) on multipolygon encoding. Tilemaker will enforce correct winding order, but in the case of a multipolygon with multiple outer ways, it assigns all inner ways to the first outer way.
FTWPL - THE FUCK THE WARRANTY PUBLIC LICENCE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
You just DO WHAT THE FUCK YOU WANT TO.
WARRANTY
There is NO FUCKING WARRANTY.
CXXFLAGS := -O3 -Wall -Wno-unknown-pragmas -Wno-sign-compare -std=c++11
LIB := -L/usr/local/lib -lz -llua5.1 -lboost_program_options -lluabind -lsqlite3 -lboost_filesystem -lboost_system -lprotobuf
INC := -I/usr/local/include -I./include -I./src -I/usr/local/include/lua5.1 -I/usr/include/lua5.1
all:
protoc --proto_path=include --cpp_out=include include/osmformat.proto include/vector_tile.proto
g++ $(CXXFLAGS) -o tilemaker include/osmformat.pb.cc include/vector_tile.pb.cc src/tilemaker.cpp $(INC) $(LIB)
install:
install -m 0755 tilemaker /usr/local/bin
clean:
rm tilemaker
.PHONY: install
Tilemaker
=========
Tilemaker creates vector tiles (in Mapbox Vector Tile format) from an .osm.pbf planet extract, as typically downloaded from providers like Geofabrik. It aims to be 'stack-free': you need no database and there is only one executable to install.
Vector tiles are used by many in-browser/app renderers, and can also power server-side raster rendering. They enable on-the-fly style changes and greater interactivity, while imposing less of a storage burden. You can output them to individual files, or to a SQLite (.mbtiles) database.
Tilemaker keeps nodes and ways in RAM. If you're processing a country extract or larger, you'll need a lot of RAM. It's best suited to city and region extracts.
Installing
----------
Tilemaker is written in C++11. The chief dependencies are:
* Google Protocol Buffers
* Boost 1.57
* Lua 5.1 and Luabind
* sqlite3
rapidjson (MIT) and sqlite_modern_cpp (MIT) are bundled in the include/ directory.
On OS X, you can install all dependencies with Homebrew. On Ubuntu, start with:
apt-get liblua5.1-0 libprotobuf-dev libsqlite3-dev protobuf-c-compiler
You'll then need to install libboost1.57-all-dev from [this PPA](https://launchpad.net/~afrank/+archive/ubuntu/boost):
add-apt-repository ppa:afrank/boost
apt-get update
apt-get install libboost1.57-all-dev
Finally, we need to install luabind manually because the Ubuntu package (sigh) requires Boost 1.54, whereas we need 1.57. So:
git clone https://github.com/rpavlik/luabind.git
cd luabind
# The following line might not be necessary for you,
# but I needed it to make sure that liblua was in /usr/lib:
ln -s /usr/lib/x86_64-linux-gnu/liblua5.1.so /usr/lib/
bjam install
ln -s /usr/local/lib/libluabindd.so /usr/local/lib/libluabind.so
ldconfig
Once you've installed those, then `cd` back to your Tilemaker directory and simply:
make
make install
If it fails, check that the LIB and INC lines in the Makefile correspond with your system, then try again.
Configuring
-----------
Vector tiles contain (generally thematic) 'layers'. For example, your tiles might contain river, cycleway and railway layers. It's up to you what OSM data goes into each layer. You configure this in Tilemaker with two files:
* a JSON file listing each layer, and the zoom levels at which to apply it
* a Lua program that looks at each node/way's tags, and places it into layers accordingly
You can read more about these in [CONFIGURATION.md](CONFIGURATION.md). Sample files are provided to work out-of-the-box.
Running
-------
At its simplest, you can create a set of vector tiles from a .pbf with this command:
tilemaker liechtenstein-latest.osm.pbf --output=tiles/
Output can be as individual files to a directory, or to an MBTiles file aka a SQLite database (with extension .mbtiles or .sqlite).
You may load multiple .pbf files in one run (for example, adjoining counties). Tilemaker does not clear the existing contents of MBTiles files, which makes it easy to load two cities into one file. This does mean you should delete any existing file if you want a fresh run.
The JSON configuration and Lua processing files are specified with --config and --process respectively. Defaults are config.json and process.lua.
You can get a run-down of available options with
tilemaker --help
When running, you may see "couldn't find constituent way" messages. This happens when the .pbf file contains a multipolygon relation, but not all the relation's members are present. Typically, this will happen when a multipolygon crosses the border of the extract - for example, a county boundary formed by a river with islands. In this case, the river will simply not be written to the tiles.
Rendering
---------
That bit's up to you! See https://github.com/mapbox/vector-tile-spec/wiki/Implementations for a list of renderers which support vector tiles.
The [Leaflet.MapboxVectorTile plugin](https://github.com/SpatialServer/Leaflet.MapboxVectorTile) is perhaps the simplest way to test out your new vector tiles.
Contributing
------------
Bug reports, suggestions and (especially!) pull requests are very welcome on the Github issue tracker. Please check the tracker to see if your issue is already known, and be nice. For questions, please use IRC (irc.oftc.net or http://irc.osm.org, channel #osm-dev) and http://help.osm.org.
Formatting: braces and indents as shown, hard tabs (4sp). (Yes, I know.) Please be conservative about adding dependencies.
Copyright and contact
---------------------
Richard Fairhurst, 2015. The tilemaker code is licensed as FTWPL; you may do anything you like with this code and there is no warranty. The included rapidjson (Milo Yip and THL A29) and sqlite_modern_cpp (Amin Roosta) libraries are MIT.
If you'd like to sponsor development of Tilemaker, you can contact me at richard@systemeD.net.
Thank you to the usual suspects for support and advice (you know who you are), to Mapbox for developing vector tiles, and to Dennis Luxen for the introduction to Lua and the impetus to learn C++.
(Looking for a provider to host vector tiles? I recommend Thunderforest: http://thunderforest.com/)
{
"layers": {
"water": { "minzoom": 11, "maxzoom": 14 },
"roads": { "minzoom": 12, "maxzoom": 14 },
"buildings": { "minzoom": 14, "maxzoom": 14 },
"pois": { "minzoom": 13, "maxzoom": 14 }
},
"settings": {
"minzoom": 12,
"maxzoom": 14,
"basezoom": 14,
"include_ids": false,
"name": "Tilemaker example",
"version": "0.1",
"description": "Sample vector tiles for Tilemaker"
}
}
sqlite modern cpp wrapper
====
This library is a lightweight modern wrapper around sqlite C api .
```c++
#include<iostream>
#include "sqlite_modern_cpp.h"
using namespace sqlite;
using namespace std;
try {
// creates a database file 'dbfile.db' if it does not exists.
database db("dbfile.db");
// executes the query and creates a 'user' table
db <<
"create table if not exists user ("
" _id integer primary key autoincrement not null,"
" age int,"
" name text,"
" weight real"
");";
// inserts a new user record.
// binds the fields to '?' .
// note that only types allowed for bindings are :
// int ,long, long long, float, double
// string , u16string
// sqlite3 only supports utf8 and utf16 strings, you should use std::string for utf8 and std::u16string for utf16.
// note that u"my text" is a utf16 string literal of type char16_t * .
db << "insert into user (age,name,weight) values (?,?,?);"
<< 20
<< u"bob" // utf16 string
<< 83.25f;
db << u"insert into user (age,name,weight) values (?,?,?);" // utf16 query string
<< 21
<< "jack"
<< 68.5;
cout << "The new record got assigned id " << db.last_insert_rowid() << endl;
// slects from user table on a condition ( age > 18 ) and executes
// the lambda for each row returned .
db << "select age,name,weight from user where age > ? ;"
<< 18
>> [&](int age, string name, double weight) {
cout << age << ' ' << name << ' ' << weight << endl;
};
// selects the count(*) from user table
// note that you can extract a single culumn single row result only to : int,long,long,float,double,string,u16string
int count = 0;
db << "select count(*) from user" >> count;
cout << "cout : " << count << endl;
// this also works and the returned value will be automatically converted to string
string str_count;
db << "select count(*) from user" >> str_count;
cout << "scount : " << str_count << endl;
}
catch (exception& e) {
cout << e.what() << endl;
}
```
Transactions
=====
You can use transactions with `begin;`, `commit;` and `rollback;` commands.
*(don't forget to put all the semicolons at the end of each query)*.
```c++
db << "begin;"; // begin a transaction ...
db << "insert into user (age,name,weight) values (?,?,?);"
<< 20
<< u"bob"
<< 83.25f;
db << "insert into user (age,name,weight) values (?,?,?);" // utf16 string
<< 21
<< u"jack"
<< 68.5;
db << "commit;"; // commit all the changes.
db << "begin;"; // begin another transaction ....
db << "insert into user (age,name,weight) values (?,?,?);" // utf16 string
<< 19
<< u"chirs"
<< 82.7;
db << "rollback;"; // cancel this transaction ...
```
Dealing with NULL values
=====
If you have databases where some rows may be null, you can use boost::optional to retain the NULL value between C++ variables and the database. Note that you must enable the boost support by including the type extension for it.
```c++
#include "sqlite_modern_cpp.h"
#include "extensions/boost_optional.h"
struct User {
long long _id;
boost::optional<int> age;
boost::optional<string> name;
boost::optional<real> weight;
};
{
User user;
user.name = "bob";
// Same database as above
database db("dbfile.db");
// Here, age and weight will be inserted as NULL in the database.
db << "insert into user (age,name,weight) values (?,?,?);"
<< user.age
<< user.name
<< user.weight;
user._id = db.last_insert_rowid();
}
{
// Here, the User instance will retain the NULL value(s) from the database.
db << "select _id,age,name,weight from user where age > ? ;"
<< 18
>> [&](long long id,
boost::optional<int> age,
boost::optional<string> name
boost::optional<real> weight) {
User user;
user._id = id;
user.age = age;
user.name = move(name);
user.weight = weight;
cout << "id=" << user._id
<< " age = " << (user.age ? to_string(*user.age) ? string("NULL"))
<< " name = " << (user.name ? *user.name : string("NULL"))
<< " weight = " << (user.weight ? to_string(*user.weight) : string(NULL))
<< endl;
};
}
```
*node: for NDK use the full path to your database file : `sqlite::database db("/data/data/com.your.package/dbfile.db")`*.
##License
MIT license - [http://www.opensource.org/licenses/mit-license.php](http://www.opensource.org/licenses/mit-license.php)
option java_package = "crosby.binary";
/* OSM Binary file format
This is the master schema file of the OSM binary file format. This
file is designed to support limited random-access and future
extendability.
A binary OSM file consists of a sequence of FileBlocks (please see
fileformat.proto). The first fileblock contains a serialized instance
of HeaderBlock, followed by a sequence of PrimitiveBlock blocks that
contain the primitives.
Each primitiveblock is designed to be independently parsable. It
contains a string table storing all strings in that block (keys and
values in tags, roles in relations, usernames, etc.) as well as
metadata containing the precision of coordinates or timestamps in that
block.
A primitiveblock contains a sequence of primitive groups, each
containing primitives of the same type (nodes, densenodes, ways,
relations). Coordinates are stored in signed 64-bit integers. Lat&lon
are measured in units <granularity> nanodegrees. The default of
granularity of 100 nanodegrees corresponds to about 1cm on the ground,
and a full lat or lon fits into 32 bits.
Converting an integer to a lattitude or longitude uses the formula:
$OUT = IN * granularity / 10**9$. Many encoding schemes use delta
coding when representing nodes and relations.
*/
/* Added */
message BlobHeader {
required string type = 1;
optional bytes indexdata = 2;
required int32 datasize = 3;
}
message Blob {
optional bytes raw = 1; // No compression
optional int32 raw_size = 2; // Only set when compressed, to the uncompressed size
optional bytes zlib_data = 3;
// optional bytes lzma_data = 4; // PROPOSED.
// optional bytes OBSOLETE_bzip2_data = 5; // Deprecated.
}
//////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////
/* Contains the file header. */
message HeaderBlock {
optional HeaderBBox bbox = 1;
/* Additional tags to aid in parsing this dataset */
repeated string required_features = 4;
repeated string optional_features = 5;
optional string writingprogram = 16;
optional string source = 17; // From the bbox field.
}
/** The bounding box field in the OSM header. BBOX, as used in the OSM
header. Units are always in nanodegrees -- they do not obey
granularity rules. */
message HeaderBBox {
required sint64 left = 1;
required sint64 right = 2;
required sint64 top = 3;
required sint64 bottom = 4;
}
///////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////
message PrimitiveBlock {
required StringTable stringtable = 1;
repeated PrimitiveGroup primitivegroup = 2;
// Granularity, units of nanodegrees, used to store coordinates in this block
optional int32 granularity = 17 [default=100];
// Offset value between the output coordinates coordinates and the granularity grid in unites of nanodegrees.
optional int64 lat_offset = 19 [default=0];
optional int64 lon_offset = 20 [default=0];
// Granularity of dates, normally represented in units of milliseconds since the 1970 epoch.
optional int32 date_granularity = 18 [default=1000];
// Proposed extension:
//optional BBox bbox = 19;
}
// Group of OSMPrimitives. All primitives in a group must be the same type.
message PrimitiveGroup {
repeated Node nodes = 1;
optional DenseNodes dense = 2;
repeated Way ways = 3;
repeated Relation relations = 4;
repeated ChangeSet changesets = 5;
}
/** String table, contains the common strings in each block.
Note that we reserve index '0' as a delimiter, so the entry at that
index in the table is ALWAYS blank and unused.
*/
message StringTable {
repeated bytes s = 1;
}
/* Optional metadata that may be included into each primitive. */
message Info {
optional int32 version = 1 [default = -1];
optional int32 timestamp = 2;
optional int64 changeset = 3;
optional int32 uid = 4;
optional int32 user_sid = 5; // String IDs
}
/** Optional metadata that may be included into each primitive. Special dense format used in DenseNodes. */
message DenseInfo {
repeated int32 version = 1 [packed = true];
repeated sint64 timestamp = 2 [packed = true]; // DELTA coded
repeated sint64 changeset = 3 [packed = true]; // DELTA coded
repeated sint32 uid = 4 [packed = true]; // DELTA coded
repeated sint32 user_sid = 5 [packed = true]; // String IDs for usernames. DELTA coded
}
// TODO: REMOVE THIS? NOT in osmosis schema.
message ChangeSet {
required int64 id = 1;
// Parallel arrays.
repeated uint32 keys = 2 [packed = true]; // String IDs.
repeated uint32 vals = 3 [packed = true]; // String IDs.
optional Info info = 4;
required int64 created_at = 8;
optional int64 closetime_delta = 9;
required bool open = 10;
optional HeaderBBox bbox = 11;
}
message Node {
required sint64 id = 1;
// Parallel arrays.
repeated uint32 keys = 2 [packed = true]; // String IDs.
repeated uint32 vals = 3 [packed = true]; // String IDs.
optional Info info = 4; // May be omitted in omitmeta
required sint64 lat = 8;
required sint64 lon = 9;
}
/* Used to densly represent a sequence of nodes that do not have any tags.
We represent these nodes columnwise as five columns: ID's, lats, and
lons, all delta coded. When metadata is not omitted,
We encode keys & vals for all nodes as a single array of integers
containing key-stringid and val-stringid, using a stringid of 0 as a
delimiter between nodes.
( (<keyid> <valid>)* '0' )*
*/
message DenseNodes {
repeated sint64 id = 1 [packed = true]; // DELTA coded
//repeated Info info = 4;
optional DenseInfo denseinfo = 5;
repeated sint64 lat = 8 [packed = true]; // DELTA coded
repeated sint64 lon = 9 [packed = true]; // DELTA coded
// Special packing of keys and vals into one array. May be empty if all nodes in this block are tagless.
repeated int32 keys_vals = 10 [packed = true];
}
message Way {
required int64 id = 1;
// Parallel arrays.
repeated uint32 keys = 2 [packed = true];
repeated uint32 vals = 3 [packed = true];
optional Info info = 4;
repeated sint64 refs = 8 [packed = true]; // DELTA coded
}
message Relation {
enum MemberType {
NODE = 0;
WAY = 1;
RELATION = 2;
}
required int64 id = 1;
// Parallel arrays.
repeated uint32 keys = 2 [packed = true];
repeated uint32 vals = 3 [packed = true];
optional Info info = 4;
// Parallel arrays