OpenEuphoria: Forum: Re: Optimizing basic operations

Re: Optimizing basic operations

new topic » goto parent » topic index » view thread » older message » newer message

Posted by DerekParnell (admin) Aug 22, 2013
4415 views

Thanks again for doing an analysis of the code. Greatly appreciated. Attempting to justify one's code is a healthy thing to do as it get's the brain thinking and (hopefully) helps improve and document the code itself.

gimlet said...

The point of using a heap is that it give O log N insert and extract min/max. It is the basis of heapsort.

Well that may be your requirements but it wasn't mine. My requirement was to give quick [O(1)] access to the top priority record at anytime - nothing more than that. The type of heap you seem to be describing is a binary heap, which is certainly a basis for heapsort. As I'm sure you are aware, there are many types of heaps, each with their own benefits and drawbacks. The thing in common with them all is that the topmost parent node is accessible in O(1) time. All other constraints depend on the application's requirements, and would thus dictate the specific heap implementation.

gimlet said...

There are faults in your method:

1 Your method allows a wildly unbalanced tree. This is not the description of a heap. Whether your heap would balance out in the long run - maybe - probably not.

I have a different opinion about whether or not an unbalanced tree can also be a heap. Given my requirements, it certainly can.

gimlet said...

2 Inserting from the top does not build a correct priority queue as it will produce last in first out for any given priority. I don't think we want to model queue-jumping.

But in my code it does! You may have missed the fact that the structure I use contains a pointer(index) to the next top priority node, which is updated when ever the priority node is superseded by a newly inserted or removed node.

gimlet said...

3 Testing with random key data will generate a balanced tree - as such it is not a fair test.

Why? Fair against which criteria? What if the real-world data being recorded is actually random?

gimlet said...

4 When you do extract min/max you promote one of its children - this may further unbalance your tree.

Yes it may, but it doesn't affect my structure's goal of providing O(1) access to the top priority node.

gimlet said...

The (standard) array approach is clean and simple. Is there some pressing reason that requires you use pointers? Your code is quite complex.

I'm sorry for the complexity. However, the API is simple to use and that will be the most that nearly anyone will see of the code.

By using pointers (indexes) I achieve at least two things - fast insertion and removal due to not having to move around array items, and stability due to each recorded added to the heap retains its index value (record-id) regardless of how many other records are added or removed.

My first attempt used an array and new records were inserted at the correct position and removed records were physically deleted from the array. I found that using the array implementation was about three times slower - updating a few pointers is faster than shuffling array items. Also, by using an array, the API user could never cache a record's position for later reference and instead would have to do a search prior to fetching a record from the heap.

new topic » goto parent » topic index » view thread » older message » newer message

OpenEuphoria

Re: Optimizing basic operations

Search

Include:

Quick Links

User menu

Misc Menu