Violets in the spring…

I’m glad I posted my database comments in my previous blog entry… It was
a little strange, but other people at LSRC 6 echoed my concerns independently
of me.

Robert “Uncle Bob” Martin made similar comments about database separation
in his keynote talk, and Nick Sutterer also stressed this idea. I believe there was
also at least one other connection that I have forgotten.

It makes me think, now more than ever, that “database separation” is an idea
whose time has come. And by that I mean, it was always a solid principle, but
some of us have formed the habit in recent years of ignoring it.

Of course, there are multiple ways of implementing database separation. I do
think a new ORM might be necessary or at least appropriate. If people start 
moving in that direction, I will at least put in my 0.02 euros.

By the way: Who can tell me why I picked the title of this blog entry? Kudos
to the real geek who knows or at least can figure it out.

ORMs and database access

These are some notes I created recently. Comments welcome…

Thinking about data storage:
Databases, ORMs, and separation of concerns

Introduction

  1. ORMs are good — they help fix the disconnect between OOP and the database. But I don’t always like the way they work.
  2. Inheritance is very brute-force
    • What if you had to inherit from IO?
    • If you inherit from ActiveRecord::Base, you can’t inherit from anything else…
    • …unless you move the ActiveRecord inheritance to the top of the hierarchy, possibly forcing classes that don’t need persistence to descend from it
    • ActiveRecord::Base adds 134 instance methods to your class
    • I also dislike AR’s lengthy method names and pathological use of method_missing
    • AR encourages higher-level thinking, but arguably not high enough
  3. I don’t like cluttering the model with database stuff
    • models should be object models, not database models
    • database logic distracts from the real object behavior
    • what if I want to change ORMs later?
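
To see how much a base class adds to a model, plain Ruby reflection is enough. This is only a sketch: HeavyBase is a hypothetical stand-in for an ORM base class (it is not ActiveRecord), and added_instance_methods is a helper name invented for the illustration.

```ruby
# Hypothetical stand-in for a heavyweight ORM base class (not ActiveRecord).
class HeavyBase
  def save; end
  def destroy; end
  def reload; end
end

class Plain; end                  # a model with no persistence baggage
class Persisted < HeavyBase; end  # a model that inherits from the base

# Lists the instance methods a class has beyond what every Object already has.
def added_instance_methods(klass)
  (klass.instance_methods - Object.instance_methods).sort
end

p added_instance_methods(Plain)      # []
p added_instance_methods(Persisted)  # [:destroy, :reload, :save]
```

Pointing the same helper at a real ORM model in your own application would show the actual count for whatever version you have installed.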

A better way?

  1. Just as the ORM tries to hide the database, our code should try to hide the ORM
  2. Encapsulating behavior at a high level (abstracting) helps the situation
  3. I have played with AR and DataMapper and glanced at Ambition. My favorite so far is Sequel.
  4. But if I had time, I would develop yet another way. I once worked with Ezra Zygmuntowicz on “PassiveRecord” (not the one on github now), but we ran out of steam.
  5. Sub-digression: What’s the receiver?
    • Sometimes the receiver seems clear (but not always).
      dog.wag(tail) # not tail.wag(dog)
    • We do this:
      puts "Hello"; STDERR.puts "Something went wrong"
    • But we could have done this:
      "Hello".puts; "Something went wrong".puts(STDERR)
    • Marshal is consistent:
      str = Marshal.dump(obj); obj = Marshal.load(str)
    • YAML is not:
      str = obj.to_yaml; obj = YAML.load(str)
    • Note that YAML thus “pollutes” every object with (at least) a to_yaml method
    • To a much greater extent, ActiveRecord pollutes the object with class and instance methods
    • I do like saying obj.save, but I’m not sure it’s worth it
    • I think I’d rather say: db.save(obj)
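
Here is a rough sketch of what db.save(obj) could look like. Everything in it is an assumption made for illustration: DataStore is a hypothetical in-memory class, not any existing ORM, and it tracks objects by object_id, which is a simplification a real store could not rely on.

```ruby
# Hypothetical in-memory store; a real one would talk to a database.
class DataStore
  def initialize
    @rows = Hash.new { |hash, klass| hash[klass] = {} }  # class => { id => attributes }
    @ids  = {}                                           # object_id => row id
  end

  # Saves any object. The store is the receiver, so the model gains no methods.
  def save(obj)
    id = (@ids[obj.object_id] ||= @ids.size + 1)
    @rows[obj.class][id] = snapshot(obj)
    id
  end

  # Rebuilds an object of the given class from its stored attributes.
  def find(klass, id)
    attrs = @rows[klass][id]
    return nil unless attrs
    obj = klass.allocate
    attrs.each { |name, value| obj.instance_variable_set(name, value) }
    obj
  end

  private

  # Copies the object's instance variables into a plain Hash.
  def snapshot(obj)
    obj.instance_variables.each_with_object({}) do |name, attrs|
      attrs[name] = obj.instance_variable_get(name)
    end
  end
end

# A plain object model with no database code in it.
class Dog
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

db   = DataStore.new
id   = db.save(Dog.new("Rex"))
copy = db.find(Dog, id)
puts copy.name
```

The point of the design is that Dog stays a plain Ruby class: swapping the store for something else would not touch the model at all.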

Ideas and principles

  1. So some of my principles are:
    • Decouple the model from the ORM/database as much as possible
    • Add the fewest possible number of methods and attributes to the model’s class and instance
    • Centralize all information (including associations) in a kind of registry (data store)
    • As far as possible, hide even those details
  2. This blog post by Piotr Solnica is excellent:
    http://solnic.eu/2011/08/01/making-activerecord-models-thin.html

  3. And also Avdi Grimm’s comments, “The trouble with ActiveRecord” at:
    http://objectsonrails.com/#ID-317548a9-552e-47ce-9aac-5e8d656511fc

More ideas

  1. The Rails mantra “convention over configuration” is a good principle in general
  2. Let the table name default to the class name – forget plurals
  3. Append _id for the id and make it the primary key
  4. Let id be an alias for CLASS_id
  5. Let fields default to String type (the most common)
  6. Let xxx_id fields default to Integer (understood to be foreign keys)
  7. Unsure: Let xxx_at fields default to DateTime and be handled automatically
  8. Inheritance: Look up the child record by parent id
  9. Rubylike fields: Array, Hash, YAML, more? Non-queryable
  10. Concise notation for defining tables, associations
  11. has_one and belongs_to are not two associations, but one
  12. Determine inheritance structure through Ruby reflection
  13. After all metadata specified, discover relationships, add extra fields, and build tables
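
A few of these defaults can be sketched in a handful of lines. The helper names (table_name_for, primary_key_for, column_type_for) are hypothetical, and lowercasing the class name is just one simple reading of "table name defaults to the class name"; namespaced or multi-word class names would need more thought.

```ruby
require "date"

# Table name defaults to the class name, with no pluralization.
def table_name_for(klass)
  klass.name.downcase
end

# Primary key is the table name with _id appended.
def primary_key_for(klass)
  "#{table_name_for(klass)}_id"
end

# Field types are guessed from the field name alone.
def column_type_for(field)
  case field.to_s
  when /_id\z/ then Integer   # understood to be a foreign key
  when /_at\z/ then DateTime  # the "unsure" timestamp convention
  else String                 # the most common case
  end
end

class Dog; end

puts table_name_for(Dog)           # dog
puts primary_key_for(Dog)          # dog_id
p column_type_for(:owner_id)       # Integer
p column_type_for(:created_at)     # DateTime
p column_type_for(:name)           # String
```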

Symbols and Strings

I’ve been thinking about Symbols today.

In Ruby 1.4, they were basically fancy Fixnums. They were immutable and immediate,
and their values didn’t really matter. (Back then, as I recall, you could actually do a puts
of a symbol and get a numeric result. I don’t recall that there was a Symbol class, though
there may have been.)

Symbols are still immutable and immediate (though we don’t mentally associate them with
integers any more). Their semantics has changed little, except that in recent versions of
Ruby I believe they are stored on the heap and thus can be garbage-collected (to avoid an
exploit where an app might create an arbitrary number of symbols and force an out-of-memory
error).

But it occurs to me now that much of their use is stringlike. I do find myself converting back and
forth between symbols and strings fairly often (especially to strings — not so often the reverse).

Consider also the mental “nearness” of strings and symbols. This has led many people to use the
Rails method with_indifferent_access, a practice I won’t support or decry here.

So I have a germ of an idea. A knowledgeable person may be able to shoot it down quickly –
the likes of Matz himself, Dave Thomas, and a dozen others whom I consider giants. Others,
despite lacking demigod status, may have useful points to make, or may have strong
opinions.  ;)

The idea is simply: Let Symbols be nothing more or less than immutable Strings. In fact, let
Symbol inherit from String.

No more with_indifferent_access. No excessive to_s and to_sym scattered everywhere. No
more asking: Does this return a Symbol or a String?
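
For anyone who hasn’t been bitten by this, here is a small plain-Ruby sketch of the friction as it exists today. The fetch_indifferent helper is hypothetical glue written for this illustration; it is not the Rails implementation.

```ruby
settings = { "color" => "red" }

p settings[:color]          # nil, because the Symbol key is not the String key
p(:color == "color")        # false
p(:color.to_s == "color")   # true, after an explicit conversion

# Hypothetical helper: try the String form of the key, then the Symbol form.
def fetch_indifferent(hash, key)
  hash.fetch(key.to_s) { hash[key.to_sym] }
end

p fetch_indifferent(settings, :color)          # "red"
p fetch_indifferent({ size: 3 }, "size")       # 3
```

If Symbols were simply immutable Strings, the helper and the conversions would have no reason to exist.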

What do you think?

Open classes: Kids, don’t try this at home

I’ve always believed that Ruby had open classes for a reason. People rail about
the consequences of misusing this feature; but my response is: Then don’t
misuse it.

Some things in life, such as spoons, are difficult to use in a dangerous way. Others,
such as automobiles and free speech, can be very dangerous. The degree of caution
must be appropriate to the risk; but that is not to say such things should never be
used.

If I’m writing code strictly for my own use, especially in a self-contained one-off script,
I reopen classes as I see fit.

For example – the other day I was writing something with RMagick, and I found myself
wanting to manipulate geometric points (and especially to represent constants simply)
without any hassle. So I did this:

class Array
  def x
    raise "Not a point" unless size == 2
    self[0]
  end
  def x=(val)
    raise "Not a point" unless size == 2
    self[0] = val
  end
  def y
    raise "Not a point" unless size == 2
    self[1]
  end
  def y=(val)
    raise "Not a point" unless size == 2
    self[1] = val
  end
  def mid(other)
    raise "Not a point" unless size == 2
    raise "Other is not a point" unless other.size == 2
    [(self.x+other.x)/2.0, (self.y+other.y)/2.0]
  end
  def distance(other)
    raise "Not a point" unless size == 2
    raise "Other is not a point" unless other.size == 2
    dx = other.x - self.x
    dy = other.y - self.y
    Math.sqrt(dx**2 + dy**2)
  end
end

This may or may not appeal to you. It enables me to initialize points very
simply, e.g.  a = [3,5]  and still access a.x and a.y as needed.

It’s far from bulletproof. An array of strings, for example, will be considered
to be a “point” up to the time we do an illegal operation and the code blows
up. But in this context, for my purposes, it felt just right.

Encodings in Ruby 1.9

Lately I’ve been studying all the details of how encodings work. I think on the average,
things work better than before. Of course, if you’ve always been a “plain ASCII in the
USA” type of programmer, you might find things a little more confusing.

For my part, I am wondering why the source encoding is separate from the IO object’s
internal encoding. Obviously the internal and external encodings have to be allowed to
differ, but I haven’t yet grasped the need to manipulate more than two encodings at a time…

If someone wants to enlighten me, feel free.
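
For reference, all three encodings can be observed side by side. This is a minimal sketch, assuming a Ruby recent enough to have Tempfile.create and File.binwrite (so later than 1.9); the sample bytes and the temp file name are made up for the illustration.

```ruby
require "tempfile"

# 1. Source encoding: how Ruby reads the string literals in this very file.
source_encoding = __ENCODING__

# "cafe" with an acute accent on the e, as ISO-8859-1 bytes (0xE9 is the accented e).
latin1_bytes = [0x63, 0x61, 0x66, 0xE9].pack("C*")

external, internal, text = Tempfile.create("latin1-sample") do |tmp|
  tmp.close
  File.binwrite(tmp.path, latin1_bytes)

  # 2. External encoding: how the bytes are stored in the file.
  # 3. Internal encoding: what the program wants its strings transcoded to.
  File.open(tmp.path, "r:ISO-8859-1:UTF-8") do |io|
    [io.external_encoding, io.internal_encoding, io.read]
  end
end

puts "source:   #{source_encoding}"
puts "external: #{external}"
puts "internal: #{internal}"
puts "bytes:    #{text.bytes.inspect}"
```

The source encoding describes the program text, while the other two describe data that arrives at run time, which may be one reason they are tracked separately.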

Starting a new job…

I’m starting a new job tomorrow. Ruby-related, of course.

Check out Game Salad (http://gamesalad.com) and wish me
luck…

Blog.reboot!

Here we go again… You’d think that someone who loves to write as much as I do would be a prolific blogger. The problem is a lack of focus. I have my fingers in too many pies, and I have more pies than fingers.

But a friend has reminded me of the importance of blogging. So I am starting to do it again.

Here goes nothing… :)
