Python 3 unclarity: Should an Int trait allow numpy ints?


Python 3 unclarity: Should an Int trait allow numpy ints?

Burnpanck
When porting traits to Python 3, some tests fail because numpy.int64
is no longer a subtype of Python's int.
As a result, "np.array(1)", for example, is no longer an acceptable
value for an Int trait.
What do you think: is this strict behaviour correct, or, in view
of the int/long unification, should every representation of an integer
be considered an Int for the purposes of traits?
_______________________________________________
Enthought-Dev mailing list
[hidden email]
https://mail.enthought.com/mailman/listinfo/enthought-dev

Re: Python 3 unclarity: Should an Int trait allow numpy ints?

Pietro Berkes
This seems to be a genuine traits bug: the CList.validate method calls the Int.validate method to check the elements of the list. Unlike Int.fast_validate, which is the method usually called, Int.validate does not take numpy types into consideration.

For example, in Python 2:

import numpy
from traits.api import *

class A(HasTraits):
    b = CList(Int)
    c = Int

a = A()
a.c = numpy.int32(1)  # succeeds
a.c = numpy.int64(1)  # succeeds
a.b = numpy.array([1,2,3], dtype=numpy.int32)  # succeeds
a.b = numpy.array([1,2,3], dtype=numpy.int64)  # fails

The reason it comes up in this context is that in Python 2 the default type of array([1,2,3]) is int32, and I think in Python 3 it is int64.






--
Pietro Berkes
Scientific software developer
Enthought UK



Re: Python 3 unclarity: Should an Int trait allow numpy ints?

Burnpanck
Actually, on my machine, under Python 2.7.4, the picture is slightly different: it is numpy.int64 that succeeds rather than numpy.int32. I also get consistent behaviour with both Int and CList(Int) (i.e. one numpy.intXX failing, one succeeding). To me, this is not a bug at all. I guess that, depending on whether it is a 32-bit or 64-bit build, one of the two numpy types actually derives from the Python 2.7 int. So it is fine for a strict implementation of an "Int" trait to accept only (sub)types of the Python int.

I think it is rather a design question of how Int, CInt, Long and CLong are supposed to work in the Python 3 future. Consistent with Python 3's naming, I would propose to consider Long and CLong deprecated. The question is: should the new "Int" be true to the letter and accept only the Python "int", or should an exception be made for numpy's integer types, in which case I would allow all of them?

numpy's integers are almost functionally equivalent to the Python 3 int, except that the numpy ints will overflow at the 64-bit boundary. The numpy array scalars are automatically promoted to 64 bit if they overflow any boundary of the smaller types, but there is no arbitrary-precision numpy array scalar. Since the Int trait does not do any casting, and assuming that the intent of an Int trait is to guarantee behaviour conforming to the Python int, strict handling would actually be the right way.
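To make the fixed-width versus arbitrary-precision distinction concrete, here is a minimal sketch. It does not use numpy: ctypes.c_int64 from the standard library serves as a stand-in for a 64-bit integer, and the helper name wrap_int64 is made up for this illustration. What numpy itself does on overflow (wrap, warn, or raise) may differ and can depend on the numpy version.

```python
import ctypes


def wrap_int64(n):
    """Return n as a signed 64-bit value, wrapping on overflow.

    Hypothetical helper: ctypes does no overflow checking, so the value
    is simply truncated to 64 bits, as a fixed-width C integer would be.
    """
    return ctypes.c_int64(n).value


largest = 2 ** 63 - 1          # largest signed 64-bit integer

# A Python 3 int just keeps growing.
python_result = largest + 1

# The fixed-width stand-in wraps around to the most negative value.
fixed_result = wrap_int64(largest + 1)

print(python_result)
print(fixed_result)
```

A trait that promises Python int behaviour cannot make that promise for a value whose arithmetic wraps like fixed_result does.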

As a side note, my python3 branch on burnpanck/traits now passes Travis CI under Python 2.6, 2.7, 3.2 and 3.3, but only by skipping the PyProtocols tests under Python 3.x.



Re: Python 3 unclarity: Should an Int trait allow numpy ints?

Pietro Berkes



On Sun, Apr 28, 2013 at 12:13 PM, Burnpanck <[hidden email]> wrote:
> Actually, on my machine, under Python 2.7.4, the picture is slightly different: it is numpy.int64 that succeeds rather than numpy.int32. I also get consistent behaviour with both Int and CList(Int) (i.e. one numpy.intXX failing, one succeeding). To me, this is not a bug at all. I guess that, depending on whether it is a 32-bit or 64-bit build, one of the two numpy types actually derives from the Python 2.7 int. So it is fine for a strict implementation of an "Int" trait to accept only (sub)types of the Python int.

You're right, I wasn't thinking about Python being a 32-bit or 64-bit build. That's why you see int64 succeed while I see int32 succeed.

However, in the trait_types code the intention is to accept all numpy integers, irrespective of byte size (https://github.com/enthought/traits/blob/master/traits/trait_types.py#L62). The quirk is that you need to make sure that numpy is imported **before** traits is imported. I suspect that if you first import numpy, then traits, the example that I shared will work for 'Int' but not for 'CList(Int)'. Could you please confirm this?


> I think it is rather a design question of how Int, CInt, Long and CLong are supposed to work in the Python 3 future. Consistent with Python 3's naming, I would propose to consider Long and CLong deprecated. The question is: should the new "Int" be true to the letter and accept only the Python "int", or should an exception be made for numpy's integer types, in which case I would allow all of them?

I agree with you: in the long term, Long and CLong will be deprecated. In the (potentially very long) transition phase where both Python 2 and 3 should be supported, I think it's ok if Long and CLong are equivalent to Int and CInt for Python 3.
 

> numpy's integers are almost functionally equivalent to the Python 3 int, except that the numpy ints will overflow at the 64-bit boundary. The numpy array scalars are automatically promoted to 64 bit if they overflow any boundary of the smaller types, but there is no arbitrary-precision numpy array scalar. Since the Int trait does not do any casting, and assuming that the intent of an Int trait is to guarantee behaviour conforming to the Python int, strict handling would actually be the right way.

> As a side note, my python3 branch on burnpanck/traits now passes Travis CI under Python 2.6, 2.7, 3.2 and 3.3, but only by skipping the PyProtocols tests under Python 3.x.

That's awesome! We need to review your branch soon.

Best,
Pietro

 




Re: Python 3 unclarity: Should an Int trait allow numpy ints?

Pietro Berkes




I realized that the lines I linked to are not very meaningful without an explanation or some digging in the C code:

    from numpy import integer, floating, complexfloating, bool_

    int_fast_validate = ( 11, int, integer )

The tuple is used in the C code: the first number selects the C validator to use, and the types that follow are all the types considered valid for the corresponding trait. In this case, a Python 'int' or a numpy 'integer' (i.e., any numpy integer type, independent of byte size) is considered valid for the 'Int' trait.

The tuple

   long_fast_validate    = ( 11, long, None, int, integer )

means "C validator #11, Python 'long's are accepted, 'int's and numpy 'integer's are cast using the 'long' constructor".
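The two tuples described above can be read with a small pure-Python sketch. This is only an illustration of the description in this message, not the traits C code: the function name check_fast_validate is made up, and the exact behaviour of the real validator may differ. Since Python 3 has no 'long', the coercing example uses a hypothetical float-based tuple instead.

```python
def check_fast_validate(spec, value):
    """Interpret a (validator_id, accepted..., None, coercible...) tuple.

    Hypothetical reading of the tuples described above: types listed
    before None are accepted unchanged; types listed after None are cast
    with the first accepted type.
    """
    types = spec[1:]                      # spec[0] is the C validator number
    if None in types:
        split = types.index(None)
        accepted, coercible = types[:split], types[split + 1:]
    else:
        accepted, coercible = types, ()
    if isinstance(value, accepted):
        return value                      # valid as-is
    if isinstance(value, coercible):
        return accepted[0](value)         # cast to the primary type
    raise TypeError("invalid value: %r" % (value,))


# Same shape as int_fast_validate, without the numpy entry.
int_spec = (11, int)
# Illustrative tuple with a None separator: floats accepted, ints cast.
float_spec = (11, float, None, int)

print(check_fast_validate(int_spec, 3))
print(check_fast_validate(float_spec, 3))
```

With this reading, adding numpy's 'integer' to the tuple is all it takes for the fast path to accept every numpy integer type, which is why the slow validate path disagreeing with it looks like an oversight.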


 


Re: Python 3 unclarity: Should an Int trait allow numpy ints?

Corran Webster
Related to the float-handling discussion we had earlier: how much of this goes away if we use (or at least add to the list of types) the appropriate ABCs from the numbers module? Under Python 3, if you do

isinstance(numpy.int16(1), numbers.Integral)

does it evaluate to True?  It doesn't with Python 2.7, but it probably should, and that seems to be an issue with NumPy not registering its types with the numbers ABCs.  And looking over the numpy source code, it doesn't seem to do that currently.

In any case, a reasonable thing to do (if it is fast) is to deprecate Long and CLong, and have the numeric trait type validators use the numbers module ABCs as the prototype (i.e. x is an Int if isinstance(x, numbers.Integral) is true). This still doesn't solve the numpy problem in the short term, so we'd still need special-casing code for numpy, but if we can get the numpy codebase to register its numeric types with the numbers module, then the import order issue will go away with the next numpy release. I don't know how amenable the numpy folks are to this change in numpy, and it may have been considered and then rejected.
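A minimal sketch of what registering a type with the numbers ABCs does, using only the standard library. FixedWidthInt is a hypothetical stand-in for a third-party scalar type, and is_int_like is a made-up name for the proposed check; neither is part of traits or numpy.

```python
import numbers


class FixedWidthInt(object):
    """Hypothetical third-party integer scalar, unrelated to Python's int."""

    def __init__(self, value):
        self.value = int(value)


def is_int_like(x):
    """The proposed test: x is an Int if it is a numbers.Integral."""
    return isinstance(x, numbers.Integral)


# Before registration the type is not recognised.
before = is_int_like(FixedWidthInt(1))

# Registration makes it a virtual subclass of the ABC; no inheritance
# from int is required, and no import-order trick either.
numbers.Integral.register(FixedWidthInt)
after = is_int_like(FixedWidthInt(1))

print(before, after)
```

Note that registration only affects the isinstance check. It does not verify that the registered type actually implements the Integral interface, so a validator relying on it trusts the registering library.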

The one concern I have is whether the numbers.Integral check is fast enough for the fast validate methods.
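One way to get a rough feel for that cost is to time both checks from Python with timeit. This is only a sketch: the helper names are made up, the loop count is arbitrary, and the real fast validators run in C, so these numbers indicate relative overhead at best.

```python
import numbers
import timeit


def concrete_check(x):
    """Check against the concrete built-in type."""
    return isinstance(x, int)


def abc_check(x):
    """Check against the abstract base class from the numbers module."""
    return isinstance(x, numbers.Integral)


def time_check(check, value, number=50000):
    """Return the total seconds taken by `number` calls of check(value)."""
    return timeit.timeit(lambda: check(value), number=number)


t_concrete = time_check(concrete_check, 1)
t_abc = time_check(abc_check, 1)

print("isinstance(x, int):              %.4f s" % t_concrete)
print("isinstance(x, numbers.Integral): %.4f s" % t_abc)
```

The ABC check goes through the metaclass's __instancecheck__ hook, so some extra overhead relative to the concrete check is to be expected; whether it matters for traits would have to be measured in the actual validator.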

-- Corran


