Liberty BASIC Community Forum
« finding the min distance between 2 sets of points »

Welcome Guest. Please Login or Register.
Mar 24th, 2017, 4:51pm


Rules|Home|Help|Search|Recent Posts|Notification


« Previous Topic | Next Topic »
Pages: 1  Notify Send Topic Print
 thread  Author  Topic: finding the min distance between 2 sets of points  (Read 327 times)
blessed2b3rd
Junior Member
ImageImage


member is offline

Avatar




PM


Posts: 65
xx finding the min distance between 2 sets of points
« Thread started on: Sep 24th, 2015, 10:42am »

Hello again,

Firstly thank you for all the very helpful solutions you have all provided. Its been nine months since my coding conception and although the curve has been steep its been an oddly pleasing balance of frustration and immense satisfaction.

I have a new issue. I have two sets of points in two separate .csv files. Lets call them set A and set B. Both sets of points are defined by XYZ coordinates. Say set A has 1000 points and set B has 50000. What I want to do is cycle through each point in set A and find which point in set B is closest.

Im figuring the logic is this:

Take the first point in set A and using the X value, cycle through every X value in set B. By minus-ing each value in set B from each value in set A I can then figure which value is smallest, thus being closest on the X plane. Doing the same for each Y value will do the same for the Y plane. Adding the distances of the X value and Y value for each point will give me an overall value and thus filtering the smallest will tell me which points are closest. I am hoping to end up with a third .csv file listing the compete set A with the closest 1000 set B points next to them.

To speed up the code I use only points that are within the range of set A so as not to cycle through ach and every 50000 set B points.

I realize this may be difficult to visualize so i'll attempt a analogy. Picture a cargo net hanging from its four corners from four posts. The cargo net has 50000 thread cross-over points where there is 1000 tennis balls hanging and tied via a piece of string to a thread cross-over at randomly chosen spots. I want to find which cross over point is closest to each tennis ball.

I'm probably making it worse but here is the logic code I've come up with so far but I'm getting stuck in the loops. Can anyone help?

Set A = CSVData
Set B = CSVData1


Code:
'************************************************************************
'Scan Co-ords for max and min values
'************************************************************************

Xmin = 100000000000
Xmax = -100000000000
Ymin = 100000000000
Ymax = -100000000000

for i = 1 to countdata

X = CSVData(i,nX)
Y = CSVData(i,nY)

    if CSVData(i,nX) < Xmin then Xmin = X
    if CSVData(i,nX) > Xmax then Xmax = X
    if CSVData(i,nY) < Ymin then Ymin = Y
    if CSVData(i,nY) > Ymax then Ymax = Y
next i

print Xmin;" "; Xmax;" "; Ymin;" "; Ymax

PXmin = 100000000000
PXmax = -100000000000
PYmin = 100000000000
PYmax = -100000000000

for i = 1 to countdata

PX = CSVData1(i,nPitX)
PY = CSVData1(i,nPitY)

    if CSVData1(i,nPitX) < PXmin then PXmin = PX
    if CSVData1(i,nPitX) > PXmax then PXmax = PX
    if CSVData1(i,nPitY) < PYmin then PYmin = PY
    if CSVData1(i,nPitY) > PYmax then PYmax = PY
next i

print PXmin;" "; PXmax;" "; PYmin;" "; PYmax

'************************************************************************
'Cycle through points
'************************************************************************

MindistX = 999999999
MindistY = 999999999

for i = 1 to countdata
    for j = 1 to countdata

DX = CSVData1(i,nMinDistX)
DY = CSVData1(i,nMinDistY)

        if CSVData(i,nX) > PXmin and CSVData(i,nX) < PXmax then
            distX = CSVData(i,nX) - CSVData1(j,nPitX)
        end if
            if CSVData(i,ny) > Pymin and CSVData(i,nY) < PYmax then
                distY = CSVData(i,nY) - CSVData1(j,nPitY)
            end if
                if distY < MindistY then MindistY = CSVData1(i,nMinDistY)
                end if
        if distX < MindistX then MindistX = CSVData1(i,nMinDistX)
        end if
    next j
next i
 


I can provide the entire code where I import all the points and provide the two sets of points but the files are large and you guys seem to come up with a clickety-click method to ensure the code works on a smaller set first.
User IP Logged

tenochtitlanuk
Board Moderator

member is offline

Avatar




Homepage PM

Gender: Male
Posts: 1137
xx Re: finding the min distance between 2 sets of poi
« Reply #1 on: Sep 24th, 2015, 4:18pm »

The following MAY be the kind of thing you're wanting.
I ASSUME the z values are irrelevant.
The data is held in a CSV string for convenience and compatibility with files.

The graphic representation shows the two overlapping data sets.
NB I'm using much smaller sets so i can see results in a few seconds. Haven't checked how long it takes for sets of 1000 and 50000!! ( ?about one day??)

I generate the two datasets internally- the 'randomize' means I get the same data each time for test purposes.
I use pythagoras to find separations ( actually use sepn^2 to save time.) then re-plot.
The new file appends nearest neighbour data to each of the first dataset's terms. Easily saved to a file.

Is this the kind of thing you are after??

Code:
'   nomainwin

randomize( 0.666)  '   seed PRNGenerator

WindowWidth  =600
WindowHeight =600

'                                       Generate two data sets to play with....
s1 =100 '1000
s2 =300 '50000

dim dataSet1$( s1)    '   string holds x y and z as csv string
dim dataSet2$( s2)
dim dataSet3$( s1)    '   will add the closest from set2 to set1

open "Display" for graphics_nsb_nf as #wg

#wg "trapclose [quit]"
#wg "down ; fill 50 50 50 ; size 2"
'                                       Show them on screen....
#wg "color red"
for i =1 to s1
    x =50 +int( 400 *rnd( 1)): y =50 +int( 400 *rnd( 1)): z =int( 256 *rnd( 1))
    dataSet1$( i) =str$( x) +"," +str$( y) +"," +str$( z)
    #wg "set "; 50 +x; " ";50 +y
    scan
next i

#wg "color cyan"
for i =1 to s2
    x =int( 500 *rnd( 1)): y =int( 500 *rnd( 1)): z =int( 256 *rnd( 1))
    dataSet2$( i) =str$( x) +"," +str$( y) +"," +str$( z)
    #wg "set "; 50 +x; " "; 50 +y
    scan
next i

'                                                   Go through all data in set1 and find closest in set2
for i =1 to s1
    sepnSqd =1E12
    Jn       =0
    for j =1 to s2
        dX =val( word$( dataSet1$( i), 1, ",")) -val( word$( dataSet2$( j), 1, ","))
        dY =val( word$( dataSet1$( i), 2, ",")) -val( word$( dataSet2$( j), 2, ","))
        dSqd =dX^2 +dY^2
        if dSqd <sepnSqd then sepnSqd =dSqd: Jn =j   '   Jn ends pointing to closest in set 2
    next j
    '                                               Create new dataset3
    dataSet3$( i) =dataSet1$( i) +"," +dataSet2$( Jn)
next i

'                                                   Clear screen and show only set1 and nearest neighbours
#wg "fill darkblue"

for i =1 to s1
    print dataSet3$( i)
    xS =int( val( word$( dataSet3$( i), 1, ",")))
    yS =int( val( word$( dataSet3$( i), 2, ",")))
    #wg "color red  ; set "; 50 +xS; " "; 50 +yS

    xS =int( val( word$( dataSet3$( i), 4, ",")))
    yS =int( val( word$( dataSet3$( i), 5, ",")))
    #wg "color cyan ; set "; 50 +xS; " "; 50 +yS
    scan
next i

wait

[quit]
close #wg
end
 


Sounds like you find programming as addictive and satisfying as I do!
User IP Logged

AndyAmaya
Senior Member
ImageImageImageImage


member is offline

Avatar




PM

Gender: Male
Posts: 261
xx Re: finding the min distance between 2 sets of poi
« Reply #2 on: Sep 24th, 2015, 4:26pm »

I would suggest using the distance formula

Don't take the square root to get the actual distance (it will take too much time). Just leave the values squared and compare squared values.

ex.
Let's say you want to find the squared distance between two 3D points (x1,y1,z1 and x2,y2,z2).

xd = (x2-x1)
yd = (y2-y1)
zd = (y2-z1)

distSquared = xd*xd + yd*yd + zd*zd

Save the distSquared and the record numbers of the A and B files in a comma separated value (.CSV) file.

Load the .csv file into Excel or Libre Office spreadsheet and sort by the distSquared column.

The smallest distances will be at the top and you will see the associated A and B record numbers in the second and third column.

That's the best I can come up with to ease your pain smiley

Good Luck!

Andy
User IP Logged

Knowledge liberates minds. It does not diminish when shared with more minds. Instead, the more it is shared, the more it grows.
tsh73
Board Moderator

member is online

Avatar

Anatoly (real name)


PM

Gender: Male
Posts: 1609
xx Re: finding the min distance between 2 sets of poi
« Reply #3 on: Sep 25th, 2015, 03:56am »

Here is it, simplest way (brute force), finding nearest from 50000 B points, to 10 A points. That is, 1/100 of the actual task.

On my machine, it takes 12 seconts

12sec*100 make 20 hours.
EDIT stupid me. 20 minutes. I sure can wait 20 minutes - just go for some coffee.

Now, I guess that something like C/C# is some 100x times faster so it will be under 1 minute.

So the obvious choise is - recode it in a faster language
or implement some smart algorythm.
Like may be Nearest neighbour search on K-d_tree

Code:
'Say set A has 1000 points and set B has 50000.
'What I want to do is cycle through each point in set A
'and find which point in set B is closest.


numA=1000
numB=50000
dim A.x(numA),A.y(numA),A.z(numA)
dim AtoB(numA)
dim B.x(numB),B.y(numB),B.z(numB)

'generate randomly
print "generate randomly"
randomize 0.5   'repeatable "randomly"
rangeMin = 1000: rangeMax = 10000
for i = 1 to numA
    A.x(i) = rangeMin + rnd(1)*(rangeMax - rangeMin)
    A.y(i) = rangeMin + rnd(1)*(rangeMax - rangeMin)
    A.z(i) = rangeMin + rnd(1)*(rangeMax - rangeMin)
    'print i, A.x(i), A.y(i), A.z(i)
next
for i = 1 to numB
    B.x(i) = rangeMin + rnd(1)*(rangeMax - rangeMin)
    B.y(i) = rangeMin + rnd(1)*(rangeMax - rangeMin)
    B.z(i) = rangeMin + rnd(1)*(rangeMax - rangeMin)
next

print "Finding nearest point "
t0=time$("ms")
for i = 1 to 10'numA
    Ax=A.x(i): Ay=A.y(i): Az=A.z(i)
    minD=1e300
    for j = 1 to numB
        'd=(A.x(i)-B.x(j))^2+(A.y(i)-B.y(j))^2+(A.z(i)-B.z(j))
        'Time taken, s 13.281 for 10x50000
        d=(Ax-B.x(j))^2+(Ay-B.y(j))^2+(Az-B.z(j))
        'Time taken, s 11.781 then cashing Ax Ay Az
        'print j, d, minD, minJ
        if d < minD then minD=d: minJ = j
    next
    AtoB(i)=minJ
next
t1=time$("ms")
print "Time taken, s "; (t1-t0)/1000
print "over"

for i = 1 to 20'numA
    print i, AtoB(i)
next
 
« Last Edit: Sep 25th, 2015, 05:13am by tsh73 » User IP Logged

The existence of bug reports means somebody is using the software and actually cares to report back to you that he is having a problem with it, instead of just deleting it from their hard disk.
(Janusz Marcin Gorycki)
tsh73
Board Moderator

member is online

Avatar

Anatoly (real name)


PM

Gender: Male
Posts: 1609
xx Re: finding the min distance between 2 sets of poi
« Reply #4 on: Sep 25th, 2015, 05:10am »

>>recode it in a faster language
Sometimes one just *need* another language. And to object it is sheer badass stubborness.

Here same code converted to MS VC# Express 2010
Code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            int numA=1000;
            int numB=50000;
            int[] Ax=new int[numA];
            int[] Ay=new int[numA];
            int[] Az=new int[numA];
            int[] AtoB=new int[numA];

            int[] Bx=new int[numB];
            int[] By=new int[numB];
            int[] Bz=new int[numB];
            
            Console.WriteLine("generate randomly");
            Random r = new Random(1);   //'repeatable "randomly"
            int rangeMin = 1000,  rangeMax = 10000;
            for (int i = 0; i < numA; i++)
			{
			    Ax[i]=r.Next(rangeMin,  rangeMax );
			    Ay[i]=r.Next(rangeMin,  rangeMax );
			    Az[i]=r.Next(rangeMin,  rangeMax );
			}
            for (int i = 0; i < numB; i++)
			{
			    Bx[i]=r.Next(rangeMin,  rangeMax );
			    By[i]=r.Next(rangeMin,  rangeMax );
			    Bz[i]=r.Next(rangeMin,  rangeMax );
			}

            Console.WriteLine("Finding nearest point");
            DateTime t0 = DateTime.Now;

            double minD, d;
            int minJ=-1;
            for (int i = 0; i < numA; i++)
			{
                minD=1e300;
			    for (int j = 0; j < numB; j++)
			    {
                    d=(Ax[i]-Bx[j])*(Ax[i]-Bx[j])
                        +(Ay[i]-By[j])*(Ay[i]-By[j])
                        +(Az[i]-Bz[j])*(Az[i]-Bz[j]);
                    if (d < minD)
	                {
		                 minD=d; minJ = j;
	                }
			    }
                AtoB[i]=minJ;
			}

            DateTime t1 = DateTime.Now;

            Console.WriteLine("over");
            Console.WriteLine("Time taken: {0}", t1 - t0);
            for (int i = 0; i < 20; i++)
            {
                Console.WriteLine("{0} {1}", i, AtoB[i]);
            }
            Console.ReadKey();
        }
    }
}
 

results:
Code:
generate randomly
Finding nearest point
over
Time taken: 00:00:02.9687690
0 48171
1 36077
2 45538
3 32153
4 11737
5 32412
6 32864
7 1619
8 8320
9 24538
10 35109
11 31782
12 15209
13 10501
14 40535
15 40947
16 40871
17 12194
18 43724
19 26686
 


That's 3 seconds for *whole* task. 3 seconds compared to 20 hours. Oh boy. I should miscalculate something...
Quick check:
12 seconds * 100 is 1200 seconds is 20 MINUTES!!!
damn.
User IP Logged

The existence of bug reports means somebody is using the software and actually cares to report back to you that he is having a problem with it, instead of just deleting it from their hard disk.
(Janusz Marcin Gorycki)
tenochtitlanuk
Board Moderator

member is offline

Avatar




Homepage PM

Gender: Male
Posts: 1137
xx Re: finding the min distance between 2 sets of poi
« Reply #5 on: Sep 25th, 2015, 06:21am »

Agree about language choice- we say in UK 'Horses for courses', ie in a horse race the particular horse to bet on varies depending on the particular place they are running.
Anyway, here's my prog taking 1min to display 50 x 10000 points.
User Image
EDIT full 1000x50000 took under two hours...
« Last Edit: Sep 25th, 2015, 08:17am by tenochtitlanuk » User IP Logged

blessed2b3rd
Junior Member
ImageImage


member is offline

Avatar




PM


Posts: 65
xx Re: finding the min distance between 2 sets of poi
« Reply #6 on: Oct 15th, 2015, 5:53pm »

Hi All,

Thanks for the replies. Sorry its taken so long to get back to this but a mix of travel for work and the Rugby World Cup things sorta took a sidetrack.

The formula and technique provided by tsh73 seems to be the most straightforward for my novice needs. I can write the code to figure out what is the shortest distance but how do I write the culpable X and Y coordinates to a new csv. The writing to the .csv file isn't the issue its identifying the X and Y values that were used to figure out the minimum distance so I can then write these back to the corresponding set A. Here is my code:


Code:
MinD = 999999999

for i = 1 to countdata
    for j = 1 to countdata

        if CSVData(i,nX) > PXmin and CSVData(i,nX) < PXmax and CSVData(i,ny) > Pymin and CSVData(i,nY) < PYmax then
            d = (CSVData(i,nX) - CSVData1(j,nPitX))^2 + (CSVData(i,ny) - CSVData1(j,nPitY))^2
                if d < MinD then MinD = d
                end if
        end if
    next j
next i
 


For each point in set A there is a MinD. How do I write the XYs from set B that gave me the MinD?
User IP Logged

Rod
Global Moderator
ImageImageImageImageImage


member is offline

Avatar

Graphics = goosebumps!


PM

Gender: Male
Posts: 5215
xx Re: finding the min distance between 2 sets of poi
« Reply #7 on: Oct 16th, 2015, 01:42am »

At the point you decide if minD is to be overwritten then also store minX and minY. Then as you exit the for next loop write minD minX and minY to your csv file.
User IP Logged

Pages: 1  Notify Send Topic Print
« Previous Topic | Next Topic »

Rules|Home|Help|Search|Recent Posts|Notification

Donate $6.99 for 50,000 Ad-Free Pageviews!

| |

This forum powered for FREE by Conforums ©
Sign up for your own Free Message Board today!
Terms of Service | Privacy Policy | Conforums Support | Parental Controls