3.8 Objects

Ruby is a very pure object-oriented language: all values are objects, and there is no distinction between primitive types and object types as there is in many other languages. In Ruby, all objects inherit from a class named Object and share the methods defined by that class. This section explains the common features of all objects in Ruby.

It is dense in parts, but it’s required reading; the information here is fundamental.

3.8.1 Object References

When we work with objects in Ruby, we are really working with object references. It is not the object itself we manipulate but a reference to it.* When we assign a value to a variable, we are not copying an object “into” that variable; we are merely storing a reference to an object into that variable. Some code makes this clear:

s = "Ruby" # Create a String object. Store a reference to it in s.

t = s # Copy the reference to t. s and t both refer to the same object.

t[-1] = "" # Modify the object through the reference in t.

print s # Access the modified object through s. Prints "Rub".

t = "Java" # t now refers to a different object.

print s,t # Prints "RubJava".

* If you are familiar with C or C++, you can think of a reference as a pointer: the address of the object in memory. Ruby does not use pointers, however. References in Ruby are opaque and internal to the implementation. There is no way to take the address of a value, dereference a value, or do pointer arithmetic.

When you pass an object to a method in Ruby, it is an object reference that is passed to the method. It is not the object itself, and it is not a reference to the reference to the object. Another way to say this is that method arguments are passed by value rather than by reference, but that the values passed are object references.

Because object references are passed to methods, methods can use those references to modify the underlying object. These modifications are then visible when the method returns.
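The following sketch illustrates the difference between modifying an object through a passed reference and merely rebinding the parameter. The method names shout! and rebind are hypothetical, chosen only for this example:

```ruby
# Hypothetical helper: mutates the object its argument refers to.
def shout!(str)
  str << "!"          # The caller sees this change
end

# Hypothetical helper: rebinds its own parameter only.
def rebind(str)
  str = "changed"     # The caller's variable is unaffected
  str
end

s = "Ruby".dup        # dup guarantees a mutable string
shout!(s)
puts s                # Prints "Ruby!"
rebind(s)
puts s                # Still prints "Ruby!"
```

The second call has no visible effect because the reference was passed by value: assigning to str inside the method changes only the method's local copy of the reference.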

3.8.1.1 Immediate values

We’ve said that all values in Ruby are objects and all objects are manipulated by reference. In the reference implementation, however, Fixnum and Symbol objects are actually “immediate values” rather than references. Neither of these classes has mutator methods, so Fixnum and Symbol objects are immutable, which means that there is really no way to tell that they are manipulated by value rather than by reference.

The existence of immediate values should be considered an implementation detail. The only practical difference between immediate values and reference values is that immediate values cannot have singleton methods defined on them. (Singleton methods are explained in §6.1.4.)
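As a brief sketch of that one practical difference, the code below defines a singleton method on a String and then attempts the same thing on a small integer. The method names shout and double are made up for the example, and the sketch assumes an interpreter that reports the failure as a TypeError:

```ruby
s = "hello".dup
def s.shout            # A singleton method on this one String object
  upcase + "!"
end
puts s.shout           # Prints "HELLO!"

n = 42
begin
  def n.double         # Try the same thing on an immediate value
    self * 2
  end
  outcome = "defined"
rescue TypeError
  outcome = "TypeError"
end
puts outcome           # Prints "TypeError"
```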

3.8.2 Object Lifetime

The built-in Ruby classes described in this chapter have literal syntaxes, and instances of these classes are created simply by including their values literally in your code. Objects of other classes need to be explicitly created, and this is most often done with a method named new:

myObject = myClass.new

new is a method of the Class class. It allocates memory to hold the new object, then it initializes the state of that newly allocated “empty” object by invoking its initialize method. The arguments to new are passed directly on to initialize. Most classes define an initialize method to perform whatever initialization is necessary for instances.
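A minimal sketch of this hand-off follows. The Point class is hypothetical and exists only to show the arguments of new arriving in initialize:

```ruby
class Point                  # Hypothetical class for illustration
  attr_reader :x, :y

  def initialize(x, y)       # Receives the arguments given to Point.new
    @x = x
    @y = y
  end
end

pt = Point.new(1, 2)         # new allocates the object, then calls initialize(1, 2)
puts pt.x                    # Prints 1
puts pt.y                    # Prints 2
```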

The new and initialize methods provide the default technique for creating new instances of a class, but classes may also define other methods, known as “factory methods,” that return instances. We’ll learn more about new, initialize, and factory methods in §7.4.

Ruby objects never need to be explicitly deallocated, as they do in languages like C and C++. Ruby uses a technique called garbage collection to automatically destroy objects that are no longer needed. An object becomes a candidate for garbage collection when it is unreachable—when there are no remaining references to the object except from other unreachable objects.

The fact that Ruby uses garbage collection means that Ruby programs are less susceptible to memory leaks than programs written in languages that require objects and memory to be explicitly deallocated and freed. But garbage collection does not mean that memory leaks are impossible: any code that creates long-lived references to objects that would otherwise be short-lived can be a source of memory leaks. Consider a hash used as a cache. If the cache is not pruned using some kind of least-recently-used algorithm, then cached objects will remain reachable as long as the hash itself is reachable. If the hash is referenced through a global variable, then it will be reachable as long as the Ruby interpreter is running.
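One way to prune such a cache is sketched below. The BoundedCache class and its max_size parameter are hypothetical, and the sketch assumes a Ruby version (1.9 or later) in which a Hash preserves insertion order, so that the first key is always the least recently used one:

```ruby
class BoundedCache                     # Hypothetical name, for illustration only
  def initialize(max_size)
    @max_size = max_size
    @store = {}                        # Insertion order doubles as recency order
  end

  def [](key)
    return nil unless @store.key?(key)
    value = @store.delete(key)         # Remove and reinsert to mark as most recent
    @store[key] = value
  end

  def []=(key, value)
    @store.delete(key)                 # Reinsert so the key moves to the newest slot
    @store[key] = value
    # Evict the least recently used entry once the limit is exceeded
    @store.delete(@store.first[0]) if @store.size > @max_size
  end

  def size
    @store.size
  end
end

cache = BoundedCache.new(2)
cache[:a] = 1
cache[:b] = 2
cache[:a]                              # Touch :a, so :b is now the oldest entry
cache[:c] = 3                          # Exceeds the limit and evicts :b
```

Once an entry is evicted, the cache no longer holds a reference to it, and the object becomes eligible for garbage collection if nothing else refers to it.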

3.8.3 Object Identity

Every object has an object identifier, a Fixnum, that you can obtain with the object_id method. The value returned by this method is constant and unique for the lifetime of the object. While the object is accessible, it will always have the same ID, and no other object will share that ID.

The method id is a deprecated synonym for object_id. Ruby 1.8 issues a warning if you use it, and it has been removed in Ruby 1.9.

__id__ is a valid synonym for object_id. It exists as a fallback, so you can access an object’s ID even if the object_id method has been undefined or overridden.

The Object class implements the hash method to simply return an object’s ID.

3.8.4 Object Class and Object Type

There are several ways to determine the class of an object in Ruby. The simplest is simply to ask for it:

o = "test" # This is a value

o.class # Returns an object representing the String class

If you are interested in the class hierarchy of an object, you can ask any class what its superclass is:

o.class # String: o is a String object

o.class.superclass # Object: superclass of String is Object

o.class.superclass.superclass # nil: Object has no superclass

In Ruby 1.9, Object is no longer the true root of the class hierarchy:

# Ruby 1.9 only

Object.superclass # BasicObject: Object has a superclass in 1.9

BasicObject.superclass # nil: BasicObject has no superclass

See §7.3 for more on BasicObject.

So a particularly straightforward way to check the class of an object is by direct comparison:

o.class == String # true if o is a String

The instance_of? method does the same thing and is a little more elegant:

o.instance_of? String # true if o is a String

Usually when we test the class of an object, we would also like to know if the object is an instance of any subclass of that class. To test this, use the is_a? method, or its synonym kind_of?:

x = 1 # This is the value we're working with

x.instance_of? Fixnum # true: is an instance of Fixnum

x.instance_of? Numeric # false: instance_of? doesn't check inheritance

x.is_a? Fixnum # true: x is a Fixnum

x.is_a? Integer # true: x is an Integer

x.is_a? Numeric # true: x is a Numeric

x.is_a? Comparable # true: works with mixin modules, too

x.is_a? Object # true for any value of x

The Class class defines the === operator in such a way that it can be used in place of is_a?:

Numeric === x # true: x is_a Numeric

This idiom is unique to Ruby and is probably less readable than using the more traditional is_a? method.

Every object has a well-defined class in Ruby, and that class never changes during the lifetime of the object. An object’s type, on the other hand, is more fluid. The type of an object is related to its class, but the class is only part of an object’s type. When we talk about the type of an object, we mean the set of behaviors that characterize the object.

Another way to put it is that the type of an object is the set of methods it can respond to. (This definition becomes recursive because it is not just the names of the methods that matter, but also the types of arguments that those methods can accept.)

In Ruby programming, we often don’t care about the class of an object; we just want to know whether we can invoke some method on it. Consider, for example, the << operator. Arrays, strings, files, and other I/O-related classes define this as an append operator. If we are writing a method that produces textual output, we might write it generically to use this operator. Then our method can be invoked with any argument that implements <<. We don’t care about the class of the argument, just that we can append to it. We can test for this with the respond_to? method:

o.respond_to? :"<<" # true if o has an << operator

The shortcoming of this approach is that it only checks the name of a method, not the arguments for that method. For example, Fixnum and Bignum implement << as a left-shift operator and expect the argument to be a number instead of a string. Integer objects appear to be “appendable” when we use a respond_to? test, but they produce an error when our code appends a string. There is no general solution to this problem, but an ad-hoc remedy, in this case, is to explicitly rule out Numeric objects with the is_a? method:

o.respond_to? :"<<" and not o.is_a? Numeric
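Put together, a generic output method might look something like the sketch below. The method name emit and its error message are assumptions made for this example:

```ruby
# Hypothetical helper: append text to anything that looks appendable.
def emit(out, text)
  unless out.respond_to?(:<<) && !out.is_a?(Numeric)
    raise ArgumentError, "cannot append to #{out.class}"
  end
  out << text
end

buffer = "".dup
emit(buffer, "hello")      # Appends to a String
lines = []
emit(lines, "hello")       # Appends to an Array

begin
  emit(1, "hello")         # Integers respond to << but are ruled out
rescue ArgumentError => e
  puts e.message           # Prints "cannot append to Integer" on recent Rubies
end
```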


Another example of the type-versus-class distinction is the StringIO class (from Ruby’s standard library). StringIO enables reading from and writing to string objects as if they were IO objects. StringIO mimics the IO API—StringIO objects define the same methods that IO objects do. But StringIO is not a subclass of IO. If you write a method that expects a stream argument, and test the class of the argument with is_a? IO, then your method won’t work with StringIO arguments.
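The short sketch below makes the point concrete. The write_greeting method is hypothetical; it tests for the behavior it needs rather than for the IO class, so it accepts a StringIO:

```ruby
require 'stringio'

# Hypothetical method: accepts any stream-like object that supports <<.
def write_greeting(stream)
  raise ArgumentError, "not appendable" unless stream.respond_to?(:<<)
  stream << "hello"
end

io = StringIO.new
class_check = io.is_a?(IO)     # false: StringIO is not a subclass of IO
write_greeting(io)             # Works anyway, because StringIO defines <<
puts io.string                 # Prints "hello"
```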

Focusing on types rather than classes leads to a programming style known in Ruby as “duck typing.” We’ll see duck typing examples in Chapter 7.

3.8.5 Object Equality

Ruby has a surprising number of ways to compare objects for equality, and it is important to understand how they work, so you know when to use each method.

3.8.5.1 The equal? method

The equal? method is defined by Object to test whether two values refer to exactly the same object. For any two distinct objects, this method always returns false:

a = "Ruby" # One reference to one String object

b = c = "Ruby" # Two references to another String object

a.equal?(b) # false: a and b are different objects

b.equal?(c) # true: b and c refer to the same object

By convention, subclasses never override the equal? method.

Another way to determine if two objects are, in fact, the same object is to check their object_id:

a.object_id == b.object_id # Works like a.equal?(b)

3.8.5.2 The == operator

The == operator is the most common way to test for equality. In the Object class, it is simply a synonym for equal?, and it tests whether two object references are identical.

Most classes redefine this operator to allow distinct instances to be tested for equality:

a = "Ruby" # One String object

b = "Ruby" # A different String object with the same content

a.equal?(b) # false: a and b do not refer to the same object

a == b # true: but these two distinct objects have equal values

Note that the single equals sign in this code is the assignment operator. It takes two equals signs to test for equality in Ruby (this is a convention that Ruby shares with many other programming languages).

Most standard Ruby classes define the == operator to implement a reasonable definition of equality. This includes the Array and Hash classes. Two arrays are equal according to == if they have the same number of elements, and if their corresponding elements are all equal according to ==. Two hashes are == if they contain the same number of key/value pairs, and if the keys and values are themselves equal. (Values are compared with the == operator, but hash keys are compared with the eql? method, described later in this chapter.)
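A few comparisons illustrate these rules. The variable names are arbitrary; the values were chosen so that == and eql? disagree about the integer 1 and the float 1.0:

```ruby
arrays_equal  = ([1, 2, 3] == [1.0, 2.0, 3.0])     # true: elements are compared with ==
lengths_equal = ([1, 2] == [1, 2, 3])              # false: different number of elements
values_equal  = ({ "k" => 1 } == { "k" => 1.0 })   # true: values are compared with ==
keys_equal    = ({ 1 => "a" } == { 1.0 => "a" })   # false: keys are compared with eql?
```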

Equality for Java Programmers

If you are a Java programmer, you are used to using the == operator to test if two objects are the same object, and you are used to using the equals method to test whether two distinct objects have the same value. Ruby’s convention is just about the opposite of Java’s.

The Numeric classes perform simple type conversions in their == operators, so that (for example) the Fixnum 1 and the Float 1.0 compare as equal. The == operators of other classes, such as String and Array, normally require both operands to be of the same class. If the righthand operand defines a to_str or to_ary conversion function (see §3.8.7), then these operators invoke the == operator defined by the righthand operand, and let that object decide whether it is equal to the lefthand string or array. Thus, it is possible (though not common) to define classes with string-like or array-like comparison behavior.

!= (“not-equal”) is used in Ruby to test for inequality. When Ruby sees !=, it simply uses the == operator and then inverts the result. This means that a class only needs to define the == operator to define its own notion of equality. Ruby gives you the != operator for free. In Ruby 1.9, however, classes can explicitly define their own != operators.
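As a small sketch, the hypothetical Celsius class below defines only ==, and != works without any further code:

```ruby
class Celsius                        # Hypothetical class for illustration
  attr_reader :degrees

  def initialize(degrees)
    @degrees = degrees
  end

  def ==(other)                      # The only equality method we define
    other.is_a?(Celsius) && degrees == other.degrees
  end
end

different = (Celsius.new(20) != Celsius.new(21))   # true: == returned false
same      = (Celsius.new(20) != Celsius.new(20))   # false: == returned true
```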

3.8.5.3 The eql? method

The eql? method is defined by Object as a synonym for equal?. Classes that override it typically use it as a strict version of == that does no type conversion. For example:

1 == 1.0 # true: Fixnum and Float objects can be ==

1.eql?(1.0) # false: but they are never eql!

The Hash class uses eql? to check whether two hash keys are equal. If two objects are eql?, their hash methods must also return the same value. Typically, if you create a class and define the == operator, you can simply write a hash method and define eql? to use ==.
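The sketch below follows that recipe for a hypothetical Coord class so that distinct but equal instances can be used interchangeably as hash keys. It assumes both coordinates are integers; mixing in floats would let two objects be eql? while their hash values differ, which breaks the rule just stated:

```ruby
class Coord                          # Hypothetical class for illustration
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def ==(other)
    other.is_a?(Coord) && x == other.x && y == other.y
  end

  alias eql? ==                      # eql? simply uses ==

  def hash                           # Equal objects must produce equal hash codes
    [x, y].hash
  end
end

labels = { Coord.new(1, 2) => "start" }
found = labels[Coord.new(1, 2)]      # "start": a distinct but eql? key finds the entry
```

Without the hash and eql? definitions, the lookup would fall back on the versions inherited from Object and would not find the entry.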

3.8.5.4 The === operator

The === operator is commonly called the “case equality” operator and is used to test whether the target value of a case statement matches any of the when clauses of that statement. (The case statement is a multiway branch and is explained in Chapter 5.) Object defines a default === operator so that it invokes the == operator. For many classes, therefore, case equality is the same as == equality. But certain key classes define === differently, and in these cases it is more of a membership or matching operator. Range defines === to test whether a value falls within the range. Regexp defines === to test whether a string matches the regular expression. And Class defines === to test whether an object is an instance of that class. In Ruby 1.9, Symbol defines === to return true if the righthand operand is the same symbol as the left or if it is a string holding the same text. Examples:

(1..10) === 5 # true: 5 is in the range 1..10

/\d+/ === "123" # true: the string matches the regular expression

String === "s" # true: "s" is an instance of the class String

:s === "s" # true in Ruby 1.9

It is uncommon to see the === operator used explicitly like this. More commonly, its use is simply implicit in a case statement.
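The sketch below shows that implicit use. Each when clause invokes === on its own value, with the target of the case statement as the argument; the describe method and the strings it returns are made up for the example:

```ruby
# Hypothetical method: classify a value using Range, Class, and Regexp matching.
def describe(value)
  case value
  when 1..10      then "between 1 and 10"   # (1..10) === value
  when Integer    then "some other integer" # Integer === value
  when /\A\d+\z/  then "a string of digits" # /\A\d+\z/ === value
  when String     then "some other string"  # String === value
  else                 "something else"
  end
end

puts describe(5)        # Prints "between 1 and 10"
puts describe(50)       # Prints "some other integer"
puts describe("123")    # Prints "a string of digits"
puts describe("abc")    # Prints "some other string"
puts describe(nil)      # Prints "something else"
```

The clauses are tried in order, so the more specific tests are listed before the more general ones.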

3.8.5.5 The =~ operator

The =~ operator is defined by String and Regexp (and Symbol in Ruby 1.9) to perform pattern matching, and it isn’t really an equality operator at all. But it does have an equals sign in it, so it is listed here for completeness. Object defines a no-op version of =~ that always returns false. You can define this operator in your own class, if that class defines some kind of pattern-matching operation or has a notion of approximate equality, for example. !~ is defined as the inverse of =~. It is definable in Ruby 1.9 but not in Ruby 1.8.
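For String and Regexp operands, the operator behaves as sketched here. The variable names are arbitrary:

```ruby
forward  = ("hello" =~ /ll/)   # 2: index at which the match begins
backward = (/ll/ =~ "hello")   # 2: the operands may appear in either order
missing  = ("hello" =~ /z/)    # nil: there is no match
inverted = ("hello" !~ /z/)    # true: !~ inverts the result of =~
```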

3.8.6 Object Order

Practically every class can define a useful == method for testing its instances for equality. Some classes can also define an ordering. That is: for any two instances of such a class, the two instances must be equal, or one instance must be “less than” the other. Numbers are the most obvious classes for which such an ordering is defined. Strings are also ordered, according to the numeric ordering of the character codes that make up the strings. (With ASCII text, this is a rough kind of case-sensitive alphabetical order.) If a class defines an ordering, then instances of the class can be compared and sorted.

In Ruby, classes define an ordering by implementing the <=> operator. This operator should return -1 if its left operand is less than its right operand, 0 if the two operands are equal, and 1 if the left operand is greater than the right operand. If the two operands cannot be meaningfully compared (if the right operand is of a different class, for example), then the operator should return nil:

1 <=> 5 # -1

5 <=> 5 # 0

9 <=> 5 # 1

"1" <=> 5 # nil: integers and strings are not comparable

The <=> operator is all that is needed to compare values. But it isn’t particularly intuitive. So classes that define this operator typically also include the Comparable module as a mixin. (Modules and mixins are covered in §7.5.2.) The Comparable mixin defines the following operators in terms of <=>:

< Less than

<= Less than or equal

== Equal

>= Greater than or equal

> Greater than

Comparable does not define the != operator; Ruby automatically defines that operator as the negation of the == operator. In addition to these comparison operators, Comparable also defines a useful comparison method named between?:

1.between?(0,10) # true: 0 <= 1 <= 10

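If the <=> operator returns nil, the two operands are unordered, and no comparison between them can succeed. The == operator derived from <=> reports the operands as unequal in this case, while the ordering operators that Comparable derives raise an ArgumentError. Classes that define their own comparison operators may simply return false instead, as Float does for the special value NaN: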

nan = 0.0/0.0 # zero divided by zero is not-a-number

nan < 0 # false: it is not less than zero

nan > 0 # false: it is not greater than zero

nan == 0 # false: it is not equal to zero

nan == nan # false: it is not even equal to itself!

nan.equal?(nan) # this is true, of course

Note that defining <=> and including the Comparable module defines a == operator for your class. Some classes define their own == operator, typically when they can implement this more efficiently than an equality test based on <=>. It is possible to define classes that implement different notions of equality in their == and <=> operators. A class might do case-sensitive string comparisons for the == operator, for example, but then do case-insensitive comparisons for <=>, so that instances of the class would sort more naturally. In general, though, it is best if <=> returns 0 if and only if == returns true.
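As a sketch of the usual pattern, the hypothetical Version class below defines only <=> and includes Comparable; every other comparison shown is derived from that one operator:

```ruby
class Version                        # Hypothetical class for illustration
  include Comparable
  attr_reader :major, :minor

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  def <=>(other)
    return nil unless other.is_a?(Version)   # Not comparable with other classes
    [major, minor] <=> [other.major, other.minor]
  end
end

old = Version.new(1, 2)
cur = Version.new(1, 10)

older    = (old < cur)                            # true
equal    = (old == Version.new(1, 2))             # true: == comes from Comparable
in_range = cur.between?(old, Version.new(2, 0))   # true
smallest = [cur, old].min                         # old: min relies on <=>
```

Here <=> returns 0 exactly when == returns true, which is the consistent behavior recommended above.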

3.8.7 Object Conversion

Many Ruby classes define methods that return a representation of the object as a value of a different class. The to_s method, for obtaining a String representation of an object, is probably the most commonly implemented and best known of these methods. The subsections that follow describe various categories of conversions.

3.8.7.1 Explicit conversions

Classes define explicit conversion methods for use by application code that needs to convert a value to another representation. The most common methods in this category are to_s, to_i, to_f, and to_a to convert to String, Integer, Float, and Array, respectively. Ruby 1.9 adds to_c and to_r methods to convert to Complex and Rational.
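A few of these conversions are sketched below on built-in values. The variable names are arbitrary, and the last line assumes Ruby 1.9 or later:

```ruby
int_from_string   = "42".to_i       # 42
float_from_string = "3.7".to_f      # 3.7
string_from_int   = 42.to_s         # "42"
array_from_range  = (1..3).to_a     # [1, 2, 3]
truncated         = 3.7.to_i        # 3: the fractional part is discarded
unparseable       = "abc".to_i      # 0: to_i does not raise on non-numeric text
rational          = "1/3".to_r      # (1/3), a Rational
```

Note that these methods are lenient: to_i returns 0 for a string with no leading digits rather than signaling an error.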
