Amph DB Model
This page is part of the online publication "XYZ". We take a branch of the CEIPAC amphora database and convert it into a CSV exchange format. The aim of the website is to provide a working example of the visual analysis of places according to a profile of stamped/typed amphorae. This is not a statistical analysis that provides a p-valued inference path; it is an exploratory analysis. Using geometric notions, the places are related to each other in terms of their amphora findings in the database. We aim for a graphical inquiry to accompany hypotheses about historic trade paths. If we can show that the data has the potential to serve as a basis for inference on trade paths, we will, as a last step, provide a statistical p-valued inference; if not, we will discuss the reasons.

General algorithm to build this website:
1. Take the database as CSV and transform it into a key-value database and an in-memory (RAM) representation.
2. Count stamps/types overall and per place.
3. Build a profile from the overall occurrence of stamps/types, or from a manual selection of stamps/types.
4. Select the places to be compared, either by a minimum number of stamped/typed amphorae or manually.
5. Compute the relative frequencies of stamped/typed amphorae per place and fill the profile of each place.
6. Compute a vector distance (for example, the Euclidean distance) between the profiles of the selected places and fill the distance matrix.
7. Calculate a (t-SNE) embedding of the profiles of the places to picture their relationships in 2D and draw the result as SVG.
8. Calculate a hierarchical clustering (average linkage, i.e. the average distance between clusters) and draw the result as SVG.
9. Draw a simple map (Mercator projection), mark the places (LONG, LAT), then connect the places within one cluster of the hierarchical clustering with a colored line.
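
Steps 2, 3, 5 and 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the website: the records, the place names ("Place A" etc.) and the stamp names ("S1" etc.) are made up, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical records: one (place, stamp) pair per amphora finding.
records = [
    ("Place A", "S1"), ("Place A", "S1"), ("Place A", "S2"),
    ("Place B", "S1"), ("Place B", "S2"), ("Place B", "S2"),
    ("Place C", "S3"), ("Place C", "S3"),
]

def build_profiles(records):
    """Return (stamps, profiles) with relative stamp frequencies per place."""
    overall = Counter(stamp for _, stamp in records)   # step 2: overall counts
    stamps = sorted(overall)                           # step 3: profile axes
    per_place = {}
    for place, stamp in records:                       # step 2: counts per place
        per_place.setdefault(place, Counter())[stamp] += 1
    profiles = {}
    for place, counts in per_place.items():            # step 5: relative frequencies
        total = sum(counts.values())
        profiles[place] = [counts[s] / total for s in stamps]
    return stamps, profiles

def euclidean(u, v):
    """Euclidean distance between two profiles of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distance_matrix(profiles):
    """Step 6: labeled, symmetric distance matrix over all places."""
    places = sorted(profiles)
    matrix = [[euclidean(profiles[p], profiles[q]) for q in places] for p in places]
    return places, matrix

stamps, profiles = build_profiles(records)
places, matrix = distance_matrix(profiles)
```

In this toy data, "Place A" and "Place B" share stamps and end up close to each other, while "Place C" has no stamp in common with them and lies far away.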

Usage of website:
The first section lets you configure and view the data from the database and the selections that were made. Every configuration is stored in your browser; if you clear the browser's local storage, the configuration is lost. If you change the configuration / selection, you need to reload the website or hit the "Recalculate" button to see the results.
If you find a place name in the website / visualizations, you may click on it to show the profile / distance relations that describe the place in the computation. In the list of selected places, the place names are color-coded. A red font color indicates a sparse profile (many zero-valued counts of stamps/types).
The map shows connections for the different layers of the hierarchical clustering. The visualization starts with the first layer, showing the connections between places that relate strongly to each other according to their profiles. By clicking the "show clusters" button you can step through the cluster layers. You can also select a cluster layer directly with the buttons labeled "CLx".
Every visualization is rendered as SVG; you can export the graphics by clicking the "Export SVG" button. If you build groups, for example of places, you must select the group for it to be part of the computation and then reload the page.

TOC

1. Configuration / Data
2. Selected data for computation
3. Amphora clustering by stamp/type profiles
4. Map projection of places and distance matrix of amphora findings
5. Error and Inference
6. References

1. Configuration / Data

Database

(Show raw database.)
(Show database statistics.)
(Choose a CSV file of the database to upload the data. Use § as the separator for the CSV data (choose it in Excel(?)/OpenOffice/LibreOffice)! Use unique file names for different versions. If a file version (i.e. name) already exists, the database is not updated.)
(Choose the version of the CSV data you want to work with.)
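
As an illustration of the upload rules above, the following Python sketch parses §-separated CSV text and stores it under its file name, skipping the update if that name is already taken. The column names ("place", "stamp", "type"), the sample rows, and the in-memory dictionary standing in for the key-value database are assumptions made for this example only.

```python
import csv
import io

def parse_database(text):
    """Parse §-separated CSV text into a list of row dicts keyed by the header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="§")
    return [dict(row) for row in reader]

def add_version(store, name, text):
    """Store parsed rows under the file name; do nothing if the name already exists."""
    if name in store:
        return False
    store[name] = parse_database(text)
    return True

# Hypothetical sample data; the real database columns may differ.
sample = "place§stamp§type\nPlace A§STAMP1§TYPE1\nPlace B§STAMP2§TYPE1\n"

store = {}
first_upload = add_version(store, "db_v1.csv", sample)
second_upload = add_version(store, "db_v1.csv", "place§stamp§type\n")
```

The second call leaves the stored version untouched, which mirrors the rule that an existing file name never triggers an update.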

Places

(Minimum count of stamped/typed amphorae (per place) for a place to be selected. Ignored if places are selected manually.)
(Manual place selection.)
(Join the counts of several places to obtain a proxy for a "region" or another grouping.)
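
The three place options can be pictured with a small Python sketch. It assumes that counts are held as one counter per place and that a joined group replaces its member places; both are assumptions for illustration, and the function names are hypothetical.

```python
from collections import Counter

def select_places(counts_per_place, min_count=0, manual=None):
    """Manual selection wins; otherwise keep places with at least min_count findings."""
    if manual:
        return sorted(p for p in manual if p in counts_per_place)
    return sorted(p for p, c in counts_per_place.items() if sum(c.values()) >= min_count)

def join_places(counts_per_place, group_name, members):
    """Return a copy in which the members' counts are summed under group_name."""
    joined = {p: Counter(c) for p, c in counts_per_place.items() if p not in members}
    total = Counter()
    for place in members:
        total.update(counts_per_place.get(place, {}))
    joined[group_name] = total
    return joined

# Hypothetical counts of stamps per place.
counts = {
    "Place A": Counter({"S1": 5}),
    "Place B": Counter({"S1": 1, "S2": 1}),
    "Place C": Counter({"S2": 3}),
}

by_threshold = select_places(counts, min_count=3)
by_hand = select_places(counts, min_count=3, manual=["Place B"])
with_region = join_places(counts, "Region X", ["Place B", "Place C"])
```

Joining small places into a region can lift them over the minimum count, at the cost of losing the distinction between the members.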

Stamps / Types

(Choose to have the profile by stamps or by types.)
(Manual stamp unification, to join stamps that refer to the same manufacturer.)
(Minimum overall count of stamped/typed amphorae for a stamp/type to be part of the profile.)
(Manual stamp/type selection to build the profile. Any other selection is ignored.)
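
A minimal Python sketch of how these options could interact: stamps are first unified through an alias table, then the profile entries are chosen either manually or by a minimum overall count. The alias table, the stamp spellings and the function names are hypothetical and only serve the example.

```python
from collections import Counter

def unify(stamp, aliases):
    """Map a stamp variant to its unified name; unknown stamps stay unchanged."""
    return aliases.get(stamp, stamp)

def profile_items(stamps, aliases=None, min_count=0, manual=None):
    """Return the sorted stamps that form the profile.

    A manual selection overrides the minimum-count rule.
    """
    aliases = aliases or {}
    overall = Counter(unify(s, aliases) for s in stamps)
    if manual:
        return sorted(s for s in manual if s in overall)
    return sorted(s for s, n in overall.items() if n >= min_count)

# Hypothetical findings: "STAMP-A" is treated as a variant of "STAMP_A".
findings = ["STAMP_A", "STAMP-A", "STAMP_B", "STAMP_A"]
aliases = {"STAMP-A": "STAMP_A"}

frequent = profile_items(findings, aliases, min_count=2)
chosen = profile_items(findings, aliases, min_count=2, manual=["STAMP_B"])
```

Without the alias, "STAMP-A" would be counted as a separate stamp and fall below the threshold on its own.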


Computation



(Download the labeled distance matrix as a CSV file and use it with Gephi or Excel.)
(Select the distance measure to build the distance matrix.)
(Select the 2D embedding to be displayed.)
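
A labeled distance matrix can be written as CSV roughly as follows. The layout (an empty top-left cell, place labels in the first row and first column), the semicolon separator and the four-decimal formatting are assumptions for this sketch; the file produced by the website may differ.

```python
import csv
import io

def matrix_to_csv(labels, matrix, delimiter=";"):
    """Serialize a square distance matrix with row and column labels."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow([""] + list(labels))                  # header row with labels
    for label, row in zip(labels, matrix):
        writer.writerow([label] + [f"{value:.4f}" for value in row])
    return buffer.getvalue()

# Hypothetical two-place matrix.
csv_text = matrix_to_csv(["Place A", "Place B"], [[0.0, 0.5], [0.5, 0.0]])
```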


(...)

2. Selected data for computation

3. Amphora clustering by stamp/type profiles
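
The hierarchical clustering with average distance between clusters can be sketched in plain Python as below. This is a naive illustration, not the implementation used by the website: it takes a labeled distance matrix and returns the sequence of merges, which can be read as successive cluster layers. The function name and the toy matrix are hypothetical.

```python
def average_linkage(labels, dist):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[i] for i in clusters[b]], d))
        clusters[a] = clusters[a] + clusters[b]   # merge b into a
        del clusters[b]                           # b > a, so index a stays valid
    return merges

# Hypothetical distances between three places.
labels = ["Place A", "Place B", "Place C"]
dist = [[0, 1, 4],
        [1, 0, 5],
        [4, 5, 0]]
merges = average_linkage(labels, dist)
```

Here the first merge joins the two closest places, and the second merge attaches the remaining place at the average of its distances to both.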


4. Map projection of places and distance matrix of amphora findings
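
For the map, a Mercator projection from (LONG, LAT) in degrees to SVG coordinates can be sketched as follows. The square map of 800 units, the scaling in which the full longitude range spans the width, and the helper for drawing a connection line are assumptions for this example; the website may scale and crop the map differently.

```python
from math import log, pi, radians, tan

def mercator(lon_deg, lat_deg, size=800.0):
    """Project (LONG, LAT) in degrees onto a size x size square; y grows downward as in SVG."""
    x = (lon_deg + 180.0) / 360.0 * size
    merc = log(tan(pi / 4.0 + radians(lat_deg) / 2.0))   # Mercator y in radians
    y = (1.0 - merc / pi) / 2.0 * size
    return x, y

def svg_line(p, q, color="red"):
    """Return an SVG line element connecting two projected points."""
    (x1, y1), (x2, y2) = p, q
    return (f'<line x1="{x1:.1f}" y1="{y1:.1f}" '
            f'x2="{x2:.1f}" y2="{y2:.1f}" stroke="{color}"/>')

# Hypothetical coordinates of two places in the same cluster.
origin = mercator(0.0, 0.0)
north_east = mercator(10.0, 45.0)
connection = svg_line(origin, north_east, color="blue")
```

Because SVG y-coordinates grow downward, northern latitudes map to smaller y values than the equator.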