probability - How can I distinguish between two different users who live near each other?
How can I distinguish between two different users, say two neighbours who live at the same address and go to the same office, but who have different driving patterns and different office schedules? I want to find the probability that two people behave more or less identically. Depending on the resolution of the map, I want to figure out who they are and how they behave. Can I create a pattern for each driver, a signature from which their identity can be traced?
I assume, from the way you asked the question, that you haven't had any plausible ideas yet. I'll base this answer purely on an idea that I might try out.
I thought of suggesting something along the lines of word-similarity metrics, but because order is not important here, maybe it's worth trying something simpler to start with. In fact, if I ever find myself considering something complex when developing a model, I take a step back and try to simplify. It's quicker to code, and you don't get attached to something that's a dead end.
So, how about histograms? If you divide time and space into larger blocks, you can increment a value in the relevant location for each time interval. That gives you a 2D histogram of a person's location. You can use some basic anti-aliasing to make the histograms more representative.
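Here is a minimal sketch of that idea in Python, just to make it concrete. Everything in it is an assumption for illustration: the function name `location_histogram`, the `cell_size` parameter, the idea that you have one `(x, y)` sample per time interval in some flat coordinate system, and the choice of bilinear weighting as the "basic anti-aliasing".

```python
import math
from collections import defaultdict


def location_histogram(samples, cell_size=1.0):
    """Build a sparse 2D histogram from (x, y) samples, one per time interval.

    Each sample is spread over the four nearest cell centres with bilinear
    weights (an assumed, very basic form of anti-aliasing), so a point near
    a cell border contributes to both neighbouring cells.
    """
    hist = defaultdict(float)
    for x, y in samples:
        # Shift by half a cell so fractions are measured between cell centres.
        gx = x / cell_size - 0.5
        gy = y / cell_size - 0.5
        ix, iy = math.floor(gx), math.floor(gy)
        fx, fy = gx - ix, gy - iy
        for dx, wx in ((0, 1.0 - fx), (1, fx)):
            for dy, wy in ((0, 1.0 - fy), (1, fy)):
                weight = wx * wy
                if weight > 0:
                    hist[(ix + dx, iy + dy)] += weight
    return dict(hist)
```

Each sample always adds a total weight of 1, so the histogram total equals the number of time intervals observed. A sample exactly at a cell centre lands entirely in that cell; one on the border between two cells is split half and half.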
From there, it's down to histogram comparison. You could implement something really basic using 1D strips. You know, sum a similarity measure over each of the vertical and horizontal strips. Linear histogram comparison is super-easy, and takes only a few lines of code in a language like C. That's enough for a proof of concept. If it feels like you're on the right track, start looking into more tricky ideas...
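A rough sketch of the strip comparison, in Python rather than C to keep it short. The names `strip_similarity` and `intersection` are made up for this example, the grids are assumed to be dense lists of rows with the same shape, and histogram intersection is just one 1D measure I picked; any other per-strip measure can be passed in instead.

```python
def normalise(grid):
    """Scale a 2D grid (list of rows) so that all cells sum to 1."""
    total = sum(sum(row) for row in grid)
    if total == 0:
        return [list(row) for row in grid]
    return [[value / total for value in row] for row in grid]


def intersection(strip_a, strip_b):
    """1D histogram intersection: the overlap between two strips."""
    return sum(min(a, b) for a, b in zip(strip_a, strip_b))


def strip_similarity(grid_a, grid_b, strip_measure=intersection):
    """Sum a 1D measure over horizontal and vertical strips, then average."""
    a, b = normalise(grid_a), normalise(grid_b)
    # Horizontal strips are the rows.
    rows = sum(strip_measure(ra, rb) for ra, rb in zip(a, b))
    # Vertical strips are the columns, obtained by transposing with zip(*).
    cols = sum(strip_measure(ca, cb) for ca, cb in zip(zip(*a), zip(*b)))
    return 0.5 * (rows + cols)
```

With the default measure the score runs from 0 (the two people never share a cell) to 1 (identical normalised histograms). With plain intersection the row and column sums come out the same, so the strips only start to matter once you swap in a measure that looks at the shape of each strip.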
The next thing I'd do is further stratify the data, using days of the week and statutory holidays... maybe stratify further using seasonal variables. I've found this pretty effective for forecasting electricity load, which is driven by social patterns and weather. Trends become more distinct when you separate them by an influencing variable.
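Stratifying could look something like the following sketch. It assumes each record is a `(timestamp, x, y)` tuple and that you supply your own set of holiday dates; `stratified_histograms`, `DAY_NAMES` and the `"holiday"` key are names invented for the example.

```python
from collections import Counter, defaultdict
from datetime import date, datetime

DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def stratified_histograms(records, cell_size=1.0, holidays=frozenset()):
    """Return one sparse 2D histogram per stratum.

    records  -- iterable of (datetime, x, y)
    holidays -- set of datetime.date objects treated as their own stratum
    """
    slices = defaultdict(Counter)
    for ts, x, y in records:
        # Holidays override the ordinary day-of-week stratum.
        key = "holiday" if ts.date() in holidays else DAY_NAMES[ts.weekday()]
        cell = (int(x // cell_size), int(y // cell_size))
        slices[key][cell] += 1
    return {key: dict(counts) for key, counts in slices.items()}
```

Seasonal variables would just be extra parts of the key, for example a `(season, day)` tuple instead of a single string.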
So, after stratification you have a stack of 2D 'slices', and the signature becomes a kind of 3D volume. I see nothing wrong with representing the entire planet as a grid, whether the squares represent 100m or 1km. It's easy to store sparsely and to prune out anything that's outside some number of standard deviations. Alternatively, you might choose just the major events of the day and end up with a handful of locations.
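For the sparse storage and pruning, here is one possible reading as a sketch: keep the histogram as a dict keyed by cell index (so empty cells cost nothing), and drop cells that lie more than `k` standard deviations from the weighted mean position on either axis. The function name `prune_outliers`, the parameter `k`, and the per-axis rule are all my assumptions; other pruning rules would fit the description equally well.

```python
import math


def prune_outliers(hist, k=2.0):
    """Drop cells further than k standard deviations from the weighted mean.

    hist -- sparse histogram: {(ix, iy): weight}
    """
    total = sum(hist.values())
    if total == 0:
        return {}
    # Weighted mean cell position on each axis.
    mean_x = sum(ix * w for (ix, _), w in hist.items()) / total
    mean_y = sum(iy * w for (_, iy), w in hist.items()) / total
    # Weighted standard deviation on each axis.
    std_x = math.sqrt(sum(w * (ix - mean_x) ** 2 for (ix, _), w in hist.items()) / total)
    std_y = math.sqrt(sum(w * (iy - mean_y) ** 2 for (_, iy), w in hist.items()) / total)
    eps = 1e-9  # tolerance so a zero-spread axis does not reject everything
    return {
        (ix, iy): w
        for (ix, iy), w in hist.items()
        if abs(ix - mean_x) <= k * std_x + eps and abs(iy - mean_y) <= k * std_y + eps
    }
```

A single far-away trip gets removed while the everyday cells stay. Bear in mind that the outlier itself inflates the standard deviation, so a robust spread estimate might be worth trying if this prunes too little.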
Then you can focus on the comparison metric. Maybe some kind of image-based gradient or cluster analysis. I'm sure there's loads of great stuff out there. These are just the kinds of starting points I'd make, having done no research.
If you need to add temporal information to introduce separation between people with similar lives, you can maybe build lags into the system... such as "where was I an hour ago". At that point (or possibly before), you'd want to switch from the over-simplified approach of averaging out a person's daily activities, and instead use something like classification trees. That kind of thing is easy and rapid to develop in a tool like MATLAB or R.
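To show what I mean by lags, here is a small sketch that only builds the lagged rows; the classifier itself is left out. It assumes you already have one cell label per fixed time interval (say, one per hour), and `lag_features` is a hypothetical helper name.

```python
def lag_features(cells, lags=(1,)):
    """Turn a sequence of cell labels into rows of (current, lagged...) values.

    cells -- list of cell labels, one per fixed time interval
    lags  -- how many intervals back to look, e.g. (1, 2) for the
             previous interval and the one before it
    """
    max_lag = max(lags)
    rows = []
    # Start at max_lag so every requested lag exists for each row.
    for t in range(max_lag, len(cells)):
        rows.append((cells[t],) + tuple(cells[t - lag] for lag in lags))
    return rows
```

For example, `lag_features(["home", "road", "office", "office"], lags=(1, 2))` gives `[("office", "road", "home"), ("office", "office", "road")]`. Rows like these, tagged with who produced them, are the sort of table you would feed to a classification tree.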