Symbols and Strings

I’ve been thinking about Symbols today.

In Ruby 1.4, they were basically fancy Fixnums. They were immutable and immediate,
and their values didn’t really matter. (Back then, as I recall, you could actually do a puts
of a symbol and get a numeric result. I don’t recall that there was a Symbol class, though
there may have been.)

Symbols are still immutable and immediate (though we don’t mentally associate them with
integers any more). Their semantics have changed little, except that in recent versions of
Ruby I believe they are stored on the heap and thus can be garbage-collected (to avoid an
exploit where an app might create an arbitrary number of symbols and force an out-of-memory
condition).

But it occurs to me now that much of their use is stringlike. I do find myself converting back and
forth between symbols and strings fairly often (especially to strings — not so often the reverse).

Consider also the mental “nearness” of strings and symbols. This has led many people to use the
Rails method with_indifferent_access, a practice I won’t support or decry here.
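
To make the mismatch concrete, here is a minimal sketch in plain Ruby (no Rails). The hash contents and variable names are made up for illustration, and transform_keys assumes Ruby 2.5 or later.

```ruby
# A hash that mixes String and Symbol keys, as often happens in practice.
settings = { "name" => "Matz", :lang => "Ruby" }

# A Symbol and a String with the same characters are different keys.
puts settings[:name].inspect    # the value lives under "name", not :name
puts settings["name"].inspect

# The usual workaround: normalize every key in one direction.
by_string = settings.transform_keys(&:to_s)
by_symbol = settings.transform_keys(&:to_sym)

puts by_string["lang"].inspect
puts by_symbol[:name].inspect
```

This is the kind of back-and-forth conversion that with_indifferent_access is meant to hide.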

So I have a germ of an idea. A knowledgeable person may be able to shoot it down quickly —
the likes of Matz himself, Dave Thomas, and a dozen others whom I consider giants. Others,
despite lacking demigod status, may have useful points to make, or may have strong
opinions.  😉

The idea is simply: Let Symbols be nothing more or less than immutable Strings. In fact, let
Symbol inherit from String.

No more with_indifferent_access. No excessive to_s and to_sym scattered everywhere. No
more asking: Does this return a Symbol or a String?
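
As a rough sketch of what that might feel like, here is a hypothetical StringSymbol class (my own name, not anything in Ruby) that is just a frozen String subclass. It leaves out what makes real Symbols cheap, such as interning and identity comparison, so treat it as an illustration of the idea rather than a proposal for an implementation.

```ruby
# Hypothetical class, not part of Ruby: an immutable String subclass.
class StringSymbol < String
  def initialize(text)
    super(text)
    freeze            # every instance is immutable from birth
  end
end

name   = StringSymbol.new("title")
record = { "title" => "Symbols and Strings" }

puts name.is_a?(String)      # it is a String, so String methods apply
puts name.frozen?
puts name == "title"         # compares by content, like any String
puts record[name].inspect    # hashes like the equivalent String key
```

Because the object hashes and compares like a String, a lookup with either form finds the same entry, which is the effect the idea is after.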

What do you think?


Open classes: Kids, don’t try this at home

I’ve always believed that Ruby had open classes for a reason. People rail about
the consequences of misusing this feature; but my response is: Then don’t
misuse it.

Some things in life, such as spoons, are difficult to use in a dangerous way. Others,
such as automobiles and free speech, can be very dangerous. The degree of caution
must be appropriate to the risk; but that is not to say such things should never be
used.

If I’m writing code strictly for my own use, especially in a self-contained one-off script,
I reopen classes as I see fit.

For example – the other day I was writing something with RMagick, and I found myself
wanting to manipulate geometric points (and especially to represent constants simply)
without any hassle. So I did this:

class Array
  # Treat a two-element array as a point: [x, y].
  def x
    raise "Not a point" unless size == 2
    self[0]
  end
  def x=(val)
    raise "Not a point" unless size == 2
    self[0] = val
  end
  def y
    raise "Not a point" unless size == 2
    self[1]
  end
  def y=(val)
    raise "Not a point" unless size == 2
    self[1] = val
  end
  # Midpoint between this point and another, as a new [x, y] array.
  def mid(other)
    raise "Not a point" unless size == 2
    raise "Other is not a point" unless other.size == 2
    [(self.x+other.x)/2.0, (self.y+other.y)/2.0]
  end
  # Euclidean distance between this point and another.
  def distance(other)
    raise "Not a point" unless size == 2
    raise "Other is not a point" unless other.size == 2
    dx = other.x - self.x
    dy = other.y - self.y
    Math.sqrt(dx**2 + dy**2)
  end
end

This may or may not appeal to you. It enables me to initialize points very
simply, e.g.  a = [3,5]  and still access a.x and a.y as needed.

It’s far from bulletproof. An array of strings, for example, will be considered
to be a “point” up to the time we do an illegal operation and the code blows
up. But in this context, for my purposes, it felt just right.
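
Here is a small sketch of both the convenience and the failure mode. It uses a trimmed-down copy of the patch (only x, y and distance) so that it runs on its own; the sample values are made up.

```ruby
# Trimmed-down copy of the Array patch, just enough to run standalone.
class Array
  def x
    raise "Not a point" unless size == 2
    self[0]
  end

  def y
    raise "Not a point" unless size == 2
    self[1]
  end

  def distance(other)
    raise "Not a point" unless size == 2
    raise "Other is not a point" unless other.size == 2
    Math.sqrt((other.x - x)**2 + (other.y - y)**2)
  end
end

origin = [0, 0]
corner = [3, 4]
length = origin.distance(corner)   # a 3-4-5 triangle
puts length

# An array of strings passes the size check, so it looks like a "point"...
labels = ["a", "b"]
puts labels.x

# ...until arithmetic is attempted on it.
failure = nil
begin
  corner.distance(labels)          # subtracts an Integer from a String
rescue NoMethodError, TypeError => e
  failure = e.class
end
puts failure
```

The size check only guards the shape of the array, not the type of its elements, which is why the error shows up late.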

Encodings in Ruby 1.9

Lately I’ve been studying all the details of how encodings work. I think that, on average,
things work better than before. Of course, if you’ve always been a “plain ASCII in the
USA” type of programmer, you might find things a little more confusing.

For my part, I am wondering why the source encoding is separate from the IO object’s
internal encoding. Obviously the internal and external encodings have to be allowed to
differ, but I haven’t yet grasped the need to manipulate more than two encodings at a time…
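
For reference, here is a minimal sketch that puts the three encodings side by side. The temp file and the ISO-8859-1 sample bytes are my own choices for illustration. As far as I can tell, the source encoding only describes string literals in the script itself, while the external and internal encodings belong to one particular IO object.

```ruby
require "tempfile"

# 1. Source encoding: how string literals in this file are tagged.
source_encoding = __ENCODING__

# "cafe" with an acute accent, written as raw ISO-8859-1 bytes
# (0xE9 is the single byte for the accented e).
file = Tempfile.new("encoding_demo")
file.binmode
file.write("caf\xE9".b)
file.close

external = internal = nil
text = File.open(file.path, "r:ISO-8859-1:UTF-8") do |io|
  external = io.external_encoding   # 2. encoding of the bytes on disk
  internal = io.internal_encoding   # 3. encoding that reads are transcoded into
  io.read
end
file.unlink

puts "source:   #{source_encoding}"
puts "external: #{external}"
puts "internal: #{internal}"
puts "string:   #{text.encoding}, bytes #{text.bytes.inspect}"
```

The single byte on disk comes back as a two-byte UTF-8 sequence, regardless of what the source encoding of the script happens to be.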

If someone wants to enlighten me, feel free.