Euphoria Ticket #768: Deserialization of 4.0.4 information in 4.1.0 results in an out of memory error.

The deserialization process needs better checks and safeguards to prevent bad data in general from crashing the interpreter.

In particular, since the format of serialized data changed between 4.0 and 4.1, using 4.1 to deserialize data that was serialized in 4.0 can cause a crash.

Details

Type: Bug Report
Severity: Blocking
Category: Interpreter
Assigned To: unknown
Status: Fixed
Reported Release: 4.1.0
Fixed in SVN #:
View VCS: none
Milestone: 4.1.0

1. Comment by DerekParnell Jun 06, 2012

It changed? Why was that done?

2. Comment by mattlewis Jun 06, 2012

The serialization format changed to accommodate 8-byte integers and 10-byte floating point numbers.

3. Comment by DerekParnell Jun 06, 2012

I haven't looked at it yet, but I'd assume that the new format would be backward compatible. In other words, 4.1 code could deserialize 4.0 data but not vice versa.

4. Comment by mattlewis Jun 06, 2012

If you can come up with a way to do that, it would be great. We could perhaps have some sort of flag that would enable us to use either format, but since we already use all 256 values of a byte in the 4.0 format, I don't think there's a way to just make it all work.

5. Comment by DerekParnell Jun 06, 2012

In v4.0, the byte pattern #FF XX #00 #00 #00, where XX is any byte, never occurs. This could be exploited in v4.1 to cater for 256 different new data types. For example, #FF #00 #00 #00 #00 could prefix an 8-byte integer and #FF #01 #00 #00 #00 could prefix a 10-byte floating point value.
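
A minimal sketch of how a reader could recognize such a prefix, taking the claim above as given. The names ext_kind, EXT_PREFIX_LEN and ext_prefix_tag are hypothetical and are not taken from serialize.e or the backend; the two tag values simply follow the examples in this comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags, following the two examples given in the comment. */
enum ext_kind {
    EXT_NONE    = -1,   /* no extension prefix: parse as plain 4.0 data */
    EXT_INT64   = 0x00, /* #FF #00 #00 #00 #00: an 8-byte integer follows */
    EXT_FLOAT80 = 0x01  /* #FF #01 #00 #00 #00: a 10-byte float follows */
};

#define EXT_PREFIX_LEN 5

/*
 * Return the tag byte XX (0..255) if buf starts with #FF XX #00 #00 #00,
 * otherwise EXT_NONE. The payload, if any, starts at buf + EXT_PREFIX_LEN.
 */
static int ext_prefix_tag(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < EXT_PREFIX_LEN)
        return EXT_NONE;
    if (buf[0] != 0xFF)
        return EXT_NONE;
    if (buf[2] != 0 || buf[3] != 0 || buf[4] != 0)
        return EXT_NONE;
    return buf[1];
}
```

A deserializer built this way would fall back to the ordinary 4.0 rules whenever EXT_NONE comes back, which is what would keep old data readable.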

6. Comment by mattlewis Jun 06, 2012

Awesome!

7. Comment by mattlewis Jun 06, 2012

The next question is how / whether we backport 8-byte integers and 10-byte floating point to 4.0.

The 8-byte integers are very straightforward.

The float80_to_atom() from 4.1 code would probably be the easiest way to get that working in 4.0. MinGW handles this, whereas Watcom requires ASM, since it doesn't natively handle long doubles. Alternatively, we could punt on these and just return a zero.
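
For the 8-byte integer case, a rough sketch of what the decoding could look like in C. It assumes the eight bytes are stored least significant byte first, which this ticket does not state, and the names read_i64_le and i64_to_atom_value are hypothetical. Since a 4.0 atom holds at most a double, values with magnitude above 2^53 may not survive the conversion exactly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assemble a signed 64-bit value from 8 bytes, assumed little-endian. */
static int64_t read_i64_le(const unsigned char *p)
{
    uint64_t u = 0;
    int64_t v;
    int i;

    assert(p != NULL);
    for (i = 7; i >= 0; i--)
        u = (u << 8) | p[i];
    /* memcpy avoids the implementation-defined unsigned-to-signed cast. */
    memcpy(&v, &u, sizeof v);
    return v;
}

/* Narrow to a double; exact only while |v| <= 2^53. */
static double i64_to_atom_value(int64_t v)
{
    return (double)v;
}
```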

8. Comment by mattlewis Jun 06, 2012

The simplest thing is probably to take the following code (from 4.1 be_machine.c), and poke it into memory, then use c_func()s to do the conversions right there in serialize.e:

typedef void ( __cdecl *convert_ptr)(void*,void*); 
convert_ptr convert_80_to_64; 
convert_ptr convert_64_to_80; 
char *code_64_to_80 = "\x55\x89\xe5\x8b\x45\x0c\x8b\x55\x08\xdd\x02\xdb\x38\x5d\xc3\x00"; 
char *code_80_to_64 = "\x55\x89\xe5\x83\xec\x08\x8b\x45\x0c\x8b\x55\x08\xdb\x2a\xdd\x5d\xf8\xdd\x45\xf8\xdd\x18\xc9\xc3\x00"; 
 
/*
 * The machine code represented in the above strings is equivalent to the following functions
 * (with a compiler where long doubles are 80-bit floating point numbers):
 *
		void convert_64_to_80( void *f64, void *f80 ){ 
			*(long double*)f80 = (long double) *(double*)f64; 
		} 
 
		void convert_80_to_64( void *f80, void *f64 ){ 
			*(double*)f64 = (double) *(long double*)f80; 
		} 
*/ 

9. Comment by mattlewis Jun 07, 2012

See: hg:euphoria/rev/c68fce6dfe2a

changeset: 5586:c68fce6dfe2a
branch: 4.0
parent: 5560:cbdc7d2e4930
user: Matt Lewis
date: Thu Jun 07 08:10:51 2012 -0400
files: docs/release/4.0.5.txt include/std/serialize.e tests/t_serialize.e
description:

  • backport deserialization for 8-byte integers and 10-byte floating point numbers
  • ticket 768

10. Comment by mattlewis Jun 07, 2012

See: hg:euphoria/rev/ade1a8e88a71

changeset: 5587:ade1a8e88a71
parent: 5579:2938e761e722
parent: 5586:c68fce6dfe2a
user: Matt Lewis
date: Thu Jun 07 11:58:38 2012 -0400
files: bin/ecp.dat docs/release/4.0.5.txt include/std/serialize.e
description:

  • merge serialization updates into trunk
  • fixes ticket 768
