### 1. Error in median()

There seems to be an error in the median() function where the number of numbers of which the median is to be found is 2.

The median of 7,8,9 is 8.

The median of 7,9 is also 8.

The Euphoria function, part of the stats include, returns 7. It returns 9 if you use it to get the median of 9,7.

It is clear if you look at the code, which includes:

```	if length(data_set) < 3 then
return data_set
end if
```

It would be better to return the average (or mean) of the dataset if the length is 2.

### 2. Re: Error in median()

To find the median, the data should be arranged in order from least to greatest. If there is an even number of items in the data set, then the median is found by taking the mean (average) of the two middlemost numbers.

### 3. Re: Error in median()

Quite so.

I've reverted to my own function, which I wrote way back before the stats package was available. To use this, I changed median() in the code I'm writing to Median(). Simple enough.

```--****************************************************************
--**********         global function Median       ****************
--****************************************************************
global function Median (object oInput)
-- Input: A sequence of atoms
-- Output: The median of those values
sequence sTemp
integer	 iIndex

if atom(oInput) then
return oInput
end if

if length(oInput) = 1 then
return oInput
end if

sTemp = sort(oInput)
if IsOdd(length(sTemp)) then
iIndex = floor(length(sTemp) / 2) + 1
return sTemp[iIndex]
else
iIndex = length(sTemp) / 2
return  (sTemp[iIndex] + sTemp[iIndex + 1]) / 2
end if

end function
```

### 4. Re: Error in median()

```       if IsOdd(length(sTemp)) then-- IsOdd IsNot found
```

### 5. Re: Error in median()

CraigWelch said...

Quite so.

I've reverted to my own function, which I wrote way back before the stats package was available. To use this, I changed median() in the code I'm writing to Median(). Simple enough.

....

Hi Craig,

No doubt your use case differs to mine but there will be similarities. Recently I needed to use a weighted median function. This was for my graphics library to do basic image denoising. Maybe I didn't look hard enough but there didn't seem to be any simple but efficient algorithm on the internet. A naive way would be to take the list of each color channel and replicate each element based on its weight and then take the center element after sorting [you could do this as a quick 'n dirty way of weighting your median function]. But using this method, even for small kernels, the processing times soon become prohibitive. The largest kernel size I could go (using standard sorting) was a measly 7x7 pixels. This was sorting 4K elements per pixel. In a multi mega-pixel image that is not a sensible option. Fortunately, I can take advantage of counting sort using a 256 length sequence of accumulators for each of the 3 color channels. A single pass over the N x N kernel, adding the weights to each accumulator, followed by a linear search for the mid entry of each channel accumulator yields a very fast, but synthetic, median RGB value.

For highest quality I also added a function to select the centermost pixel in 3d space. There is no natural way to directly sort RGB pixels. The algorithm I use computes the accumulated distances from each pixel to all other pixels. Lowest accumulated value is deemed to be the median pixel. Unlike the other algorithm a real pixel value is actually returned. I use this code (in Orac) which is reminiscent of Bubble Sort in its n^2 time complexity. Although it's slow the quality seems to be the best. The trick to the weighting is to multiply the other pixel (see p1, p2 below) with the weights associated to a pixel:

```global function weighted_median_rgb(seq s, seq wgts)

-- get the median rgb value from the list of pixels
-- this method is not efficient..
-- but unlike *synth() it picks a value rather than compute one ..

-- the method is to compute the sum of euclidean distance between all points
-- lowest overall score wins

-- init
seq d = repeat(0, ~s) -- distances

-- loop
for i = 1 to ~s-1 do
int p1 = s.i
int w1 = wgts.i
for j = i+1 to ~s do
int p2 = s.j
int w2 = wgts.j
atom dist = ColourDiff2(p1, p2)
d.i += dist * w2 -- see how they are opposites here..
d.j += dist * w1
end for
end for

-- find minimum
int n = 1
atom min = d.1
for i = 2 to ~d do
if d.i < d.n then n = i end if
end for

-- exit
return s.n

end function
```

Spock