---
title: "Importing a database"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Importing a database from Nominatim}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, include=FALSE}
library(photon)
```

As specified in the [introduction vignette](photon.html), you can download pre-built search indices for selected country extracts. If you need more control over the geocoding data, you can import from an existing Nominatim database or from a JSON dump. This vignette guides you through the setup and import of an external database.

# Importing from Nominatim

Technically, Nominatim databases can only be reliably set up on Linux systems. Here, we use the `mediagis/nominatim` Docker image to set up Nominatim irrespective of the operating system. You can use the helper functions `cmd_options()` and `run()` to run a Nominatim Docker container. It is important to expose port 5432 on the host machine; otherwise, photon cannot connect to the database.

```{r, eval=FALSE}
# Assemble the arguments for `docker run`
opts <- cmd_options(
  e = "PBF_URL=https://download.geofabrik.de/australia-oceania/samoa-latest.osm.pbf",
  e = "NOMINATIM_PASSWORD=mypassword",
  e = "FREEZE=true",
  p = "8080:8080",
  p = "5432:5432",
  name = "nominatim",
  "mediagis/nominatim:4.4",
  use_double_hyphens = TRUE
)

# Note: on Windows, make sure you have Docker Desktop running!
# `process` is the R6 process class from the processx package
nominatim <- processx::process$new("docker", c("run", opts))

# Wait until Nominatim is ready
ready <- FALSE
while (!ready) {
  Sys.sleep(5)
  logs <- run("docker", c("logs", "nominatim"))
  ready <- any(grepl("ready to accept requests", logs))
}

# Set the database password of the nominatim user
run(
  "docker",
  c(
    "exec", "--user", "postgres", "nominatim",
    "psql", "-d", "nominatim", "-c",
    "ALTER USER nominatim WITH ENCRYPTED PASSWORD 'mypassword'"
  )
)
```

To verify that the database accepts connections, you can connect to it from R.

```{r, eval=FALSE}
library(RPostgres)

# Use the same password that was set for the nominatim user above
db <- dbConnect(Postgres(), password = "mypassword", user = "nominatim")
dbGetInfo(db)
#> $dbname
#> [1] "nominatim"
#>
#> $host
#> [1] "localhost"
#>
#> $port
#> [1] "5432"
#>
#> $username
#> [1] "nominatim"
#>
#> $protocol.version
#> [1] 3
#>
#> $server.version
#> [1] 140013
#>
#> $db.version
#> [1] 140013
#>
#> $pid
#> [1] 604

dbDisconnect(db)
```

If the connection succeeds, you can start a new photon instance and import the database using `$import()`. The database import creates the folder `photon_data` inside the given photon directory.

```{r, eval=FALSE}
dir <- file.path(tempdir(), "photon")
photon <- new_photon(dir, overwrite = TRUE)
#> ℹ java version "22" 2024-03-19
#> ℹ Java(TM) SE Runtime Environment (build 22+36-2370)
#> ℹ Java HotSpot(TM) 64-Bit Server VM (build 22+36-2370, mixed mode, sharing)
#> ✔ Successfully downloaded photon 1.0.0. [8.2s]
#> ℹ No search index downloaded! Download one or import from a Nominatim database.
#> • Version: 1.0.0

photon$import(host = "localhost", password = "mypassword")
```

After the import has finished, you can start the photon instance.

```{r, eval=FALSE}
photon$start()
#> 2024-10-24 23:26:46,360 [main] WARN  org.elasticsearch.node.Node - version [5.6.16-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
#> ✔ Photon is now running. [11.1s]
```

```{r, eval=FALSE}
geocode("Apia", limit = 3)
#> Simple feature collection with 3 features and 13 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -171.7631 ymin: -13.83613 xmax: -171.7512 ymax: -13.82611
#> Geodetic CRS:  WGS 84
#> # A tibble: 3 × 14
#>     idx osm_type     osm_id country osm_key city        street     countrycode osm_value name  state type  extent
#> 1     1 W        1322127938 Samoa   place   NA          NA         WS          city      Apia  Tuam… city
#> 2     1 W         723300892 Samoa   landuse Matautu Tai NA         WS          harbour   Apia… Tuam… other
#> 3     1 W         666117780 Samoa   tourism Levili      Levili St… WS          attracti… Apia… Tuam… house
#> # ℹ 1 more variable: geometry
```

# Importing from a JSON dump

Since photon 0.7.0, databases can be dumped to and imported from JSON files (so-called Nominatim Dump Files, see the [docs](https://github.com/komoot/photon/blob/master/docs/json-dump-format-0.1.0.md)). While pre-built databases are not available for every region through `$download_data()`, JSON dumps are. You can download a JSON dump instead of a pre-built database by setting `json = TRUE`.

```{r, eval=FALSE}
photon$remove_data()
photon$download_data("Andorra", json = TRUE)
```

You can then import the downloaded dump using the `$import()` method with `json = TRUE`.

```{r, eval=FALSE}
photon$import(json = TRUE)
```
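
Once the JSON import has finished, the instance can presumably be started and queried in the same way as after a Nominatim import. The following is a minimal sketch, assuming the Andorra dump above was imported successfully. The query string `"Andorra la Vella"` is only an illustration, and the results depend on the imported data.

```{r, eval=FALSE}
# Start the photon instance on top of the freshly imported index
photon$start()

# Query a place that should be covered by the Andorra extract
geocode("Andorra la Vella", limit = 1)
```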