Ruby Class Inheritance II: Differences between inheritance and mixin

Developers familiar with Rails have probably seen code like the following many times, and are not surprised by it:

ActiveRecord
class User < ActiveRecord::Base
end

first_user = User.find(0)

But the code is not as simple as it looks, especially for those coming from the Java or C# world.
From this piece of code we can see that the class User inherits the class method find from its parent class ActiveRecord::Base. (If you have doubts about or are interested in how this works, see the post Ruby Class Inheritance.)

If you write the following code, it should work fine:

Simple Class
class Base
  def self.foo
    bar_result = new.bar
    "foo #{bar_result}"
  end

  def bar
    'bar'
  end
end

class Derived < Base
end

Base.new.bar.should == 'bar'
Derived.new.bar.should == 'bar'
Base.foo.should == "foo bar"
Derived.foo.should == "foo bar"

In Ruby’s world, most of the time you can replace inheritance with a module mixin. So let’s try to refactor the code as follows:

Extract to Module
module Base
  def self.foo
    bar_result = new.bar
    "foo #{bar_result}"
  end

  def bar
    'bar'
  end
end

class Derived
  include Base
end

If we run the tests again, the second one fails:

Test
Derived.new.bar.should == 'bar'  # Passed
Derived.foo.should == 'foo bar'  # Failed

The test fails because the method ‘foo’ is not defined on Derived!
This is interesting: if we inherit from a class, the class methods of the base class are available on the subclass; but if we include a module, the class-level methods defined on the module are not available on the host class!

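As we discussed before (Ruby Class Inheritance), mixing in a module is equivalent to inserting an anonymous class that carries the module’s instance methods into the ancestor chain of the host class. A method defined with def self.foo lives on the module object itself, which is not part of that chain.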
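To make the ancestor-chain argument concrete, here is a minimal sketch. The names Greeter, Host, shout and greet are made up for illustration and are not part of the original example:

```ruby
module Greeter
  # Singleton method: lives on the Greeter object itself.
  def self.shout
    'HELLO'
  end

  # Instance method: reachable through the ancestor chain of any includer.
  def greet
    'hello'
  end
end

class Host
  include Greeter
end

p Host.ancestors.first(2)    # => [Host, Greeter]
p Host.new.greet             # => "hello"
p Greeter.shout              # => "HELLO"
p Host.respond_to?(:shout)   # => false
```

Including the module only changes where instances of Host look up methods; the lookup path of the Host class object itself is untouched, so shout stays invisible to it.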

So is there any way to make all the tests pass with the module approach? Absolutely, but it takes a small trick:

Extract to Module ver 2
module Base
  module ClassMethods
    def foo
      bar_result = new.bar
      "foo #{bar_result}"
    end
  end

  def bar
    'bar'
  end

  # Hook that Ruby calls whenever this module is included;
  # `mod` is the including class or module.
  def self.included(mod)
    mod.extend ClassMethods
  end
end

class Derived
  include Base
end

Derived.new.bar.should == 'bar'  # Passed
Derived.foo.should == 'foo bar'  # Passed

Pitfall in fs.watch: fs.watch fails when switch from TextMate to RubyMine

I’m writing a cake script that helps me build the growlStyle bundle.
I want the script to watch the source files for changes, and rebuild whenever a file changes.
So I wrote the following code:

Watching code change
files = fs.readdirSync getLocalPath('source')
for file in files
  fs.watch file, ->
    console.log "File changed, rebuilding..."
    build()

The code works when I edit the source with TextMate, but fails when I use RubyMine!

Super weird!

After half an hour of debugging, I found the following interesting phenomena:

  • Given I’m using TextMate
    When I change the file the 1st time
    Then a ‘change’ event is captured
    When I change the file the 2nd time
    Then a ‘change’ event is captured
    When I change the file the 3rd time
    Then a ‘change’ event is captured

  • Given I’m using RubyMine
    When I change the file the 1st time
    Then a ‘rename’ event is captured
    When I change the file the 2nd time
    Then no event is captured
    When I change the file the 3rd time
    Then no event is captured

From these results, we can see that the script fails because the “change” event is not triggered as expected when using RubyMine.
The reason for RubyMine’s “weird” behavior is probably that RubyMine wants to protect the file’s integrity, so it writes the file atomically, as follows:

  1. RubyMine writes the file content to a temp file
  2. RubyMine removes the original file
  3. RubyMine renames the temp file to the original file name

This workflow ensures that the content is either fully written or not written at all. In short, RubyMine does not actually write to the file; it replaces the original file with another one, and the original is removed or moved to some special location.
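The difference between the two ways of saving can be observed from a small script. This is only a sketch in Ruby (not what RubyMine actually runs): atomic_replace, observe_inodes and the “.tmp” suffix are hypothetical names, and it assumes a POSIX-style filesystem where every file has an inode number.

```ruby
require 'tmpdir'

# Replace the file at `path` following the three steps described above.
def atomic_replace(path, content)
  temp_path = "#{path}.tmp"        # hypothetical temp-file naming
  File.write(temp_path, content)   # 1. write the new content to a temp file
  File.delete(path)                # 2. remove the original file
  File.rename(temp_path, path)     # 3. rename the temp file to the original name
end

# Returns the inode numbers seen at the same path after each kind of write.
def observe_inodes
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'source.jade')
    File.write(path, 'v1')
    original = File.stat(path).ino

    File.write(path, 'v2')         # in-place write, TextMate style
    after_in_place = File.stat(path).ino

    atomic_replace(path, 'v3')     # atomic write, RubyMine style
    after_atomic = File.stat(path).ino

    { original: original, after_in_place: after_in_place, after_atomic: after_atomic }
  end
end

inodes = observe_inodes
puts "in-place write kept the inode: #{inodes[:original] == inodes[:after_in_place]}"
puts "atomic replace kept the inode: #{inodes[:original] == inodes[:after_atomic]}"
```

On such a filesystem the first line should report true and the second false: after an atomic replace the path points to a different file, so anything that was tracking the old file itself no longer sees what happens at that path.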

On the other hand, according to the Node.js documentation for fs.watch, Node uses kqueue on Mac to implement this behavior.
And according to the kqueue documentation, it identifies a file by its file descriptor, and a file descriptor is bound to the file itself rather than to its path. So when the file is renamed or replaced, the watcher keeps tracking the original file, not whatever now lives at the old path. That’s why we lose track of the file after the first ‘rename’ event.
In our case, we actually want to identify the file by its path rather than by its file descriptor.

To solve this issue, we have 2 potential solutions:

  1. Apply fs.watch to the directory that holds the source file, in addition to the source file itself.
    When the file is updated in place, as TextMate does, the watcher on the file raises the “change” event.
    When the file is updated atomically, as RubyMine does, the watcher on the directory raises 2 “rename” events.
    So theoretically, we could track changes to the file no matter how it is updated.

  2. Use the old-fashioned fs.watchFile function, which tracks changes with fs.stat.
    Compared to fs.watch, fs.watchFile is less efficient because of its polling mechanism, but it does track the file by file name rather than by file descriptor. So it won’t be fooled by the fancy atomic writing.

At first glance, the 1st solution looks better than the 2nd one, because it uses events rather than old-fashioned polling. Even the documentation of fs.watchFile says to use fs.watch instead of fs.watchFile when possible.

But it is actually kind of painful to write such code, since the ‘rename’ event on the directory is triggered not only by file updates, but also by adding and removing files.

And the ‘rename’ event is triggered twice when a file is updated. Obviously we cannot rebuild when the first ‘rename’ event fires, or the build might fail because the file is absent. And we would trigger the build twice in a really short period of time.

So in fact, the polling fs.watchFile is more useful for our problem: being old-fashioned is exactly what protects it from being fooled by the ‘fancy’ atomic file writing.

So finally, we got the following code:

fs.watchFile
runInWatch = (options, task) ->
  # Without the watch option, just run the task once
  return task(options) unless options.watch

  console.info "INFO: Watching..."
  files = fs.readdirSync getLocalPath('source')
  console.log "Tracking files:"
  # forEach gives every callback its own `file` binding
  files.forEach (file) ->
    console.log "#{file}"
    fs.watchFile getLocalPath('source', file), (current, previous) ->
      # mtime values are Date objects, so compare their timestamps
      unless current.mtime.getTime() is previous.mtime.getTime()
        console.log "#{file} Changed..."
        task(options)

HINT: Be careful about the differences between fs.watch and fs.watchFile:

  • The meaning of the filename parameter
    The filename parameter of fs.watch is path sensitive: it accepts ‘source.jade’ or ‘/path/to/source.jade’.
    The filename parameter of fs.watchFile isn’t path sensitive: it only accepts ‘/path/to/source.jade’.
  • Callback invocation condition
    fs.watch invokes the callback when the file is renamed or changed.
    fs.watchFile invokes the callback whenever the file is accessed, including both writes and reads.
    So you need to compare the mtime of the two stat objects; the file has changed only when mtime has changed.
  • Response time
    fs.watch uses events, so it captures the “change” almost in real time.
    fs.watchFile uses polling, so the notification may be delayed for a period of time. By default, the delay can be up to about 5s.

CSS trick: Place Scrollbar outside of the client area

Today I found an interesting difference between padding and margin while working on the Metrics 2.0 Introduction page. There are several VideoThumbnail widgets on the page, each of which contains a video snapshot and a paragraph of text description.
Here is the HTML DOM of the widget, written in Haml:

VideoThumbnail Widget Html
%li.span4
  %a.thumbnail.new(data-widget="IntroductionPage.VideoThumbnail")
    .snapshot-container
      %img.snapshot{ src: snapshot }
      %img.status(src="/assets/new.png" )
    .caption
      .title
        %h3 #{index}. #{title}
      %p.description #{description}

Since the description could be very short or very long, I made the div that contains the description scrollable, with the following stylesheet for the caption div:

VideoThumbnail Widget Stylesheet
.caption {
  padding: 9px;
  height: 150px;
  overflow-y: auto;
}

The style looks fine, and here is how it looks:

Widget

But very soon I found that the widget with a scrollbar is taller than the ones without it, because the padding of 2 adjacent elements is not merged (the red rectangle in the following snapshot):

Padding

This happens because padding does not collapse the way margin does. To solve the issue, I changed the padding to margin in the stylesheet:

VideoThumbnail Widget Stylesheet
.caption {
  padding: 0;
  margin: 9px;
  height: 150px;
  overflow-y: auto;
}

Now the bottom margin is correct, but the scrollbar begins to occupy the space of the content, which is not good! The center widget uses padding (blue) and the right one uses margin (red):

Margin

I noticed that if I use padding, the scrollbar takes the space of the right padding; but if I use margin, it takes the space of the content. So I updated the stylesheet this way:

VideoThumbnail Widget Stylesheet
.caption {
  padding: 0 9px 0 0;
  margin: 9px 0 9px 9px;
  height: 150px;
  overflow-y: auto;
}
Padding Margin Mixing

I use padding on the right but margin on the other sides, so the vertical scrollbar takes the right padding when necessary. It is a very interesting CSS trick, and it works fine in WebKit-based browsers.

Pitfall in Nokogiri XPath and Namespace

Nokogiri is a really popular XML and HTML library for Ruby. People love Nokogiri not just because it is powerful and fast; most importantly, it is flexible and convenient.
Nokogiri works perfectly in most respects, but there is a big pitfall when handling XML namespaces!

I hit a super weird issue when processing XML returned by the Google Data API. The API returns the following XML document:

API response
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007">
  <entry>
    <media:group>(...)</media:group>
    <yt:position>1</yt:position>
  </entry>
  <entry>
    <media:group>(...)</media:group>
    <yt:position>2</yt:position>
  </entry>
  <entry>
    <media:group>(...)</media:group>
    <yt:position>3</yt:position>
  </entry>
  <entry>
    <media:group>(...)</media:group>
    <yt:position>4</yt:position>
  </entry>
</feed>

I instantiated a Nokogiri::XML DOM from the XML document, and then tried to query the DOM with XPath: xml_dom.xpath '//entry':

Query DOM
# Faraday.get returns a response object; Nokogiri needs its body
xml_dom = Nokogiri::XML Faraday.get(api_url).body
entries = xml_dom.xpath '//entry'

I expected entries to be an array with 4 elements, but it was actually empty. After a few tries, I found that the query yields an empty array whenever I use an element name in the query.

Try Xpath Queries
xml_dom.xpath '.' # returns document
xml_dom.xpath '//.' # returns all element nodes
xml_dom.xpath '/feed' # returns empty array
xml_dom.xpath '//entry' # returns empty array
xml_dom.xpath '//media:group', 'media' => 'http://search.yahoo.com/mrss/' # returns the 4 media:group nodes

It is super weird.

After half an hour of fighting against Nokogiri, I began to realize that it must be related to the namespace.
I found that there is an attribute on the root element of the document: xmlns="http://www.w3.org/2005/Atom", which means all elements without an explicit namespace prefix in the XML DOM are in the namespace http://www.w3.org/2005/Atom by default.

And the XPath query is namespace sensitive: in XPath 1.0, an unprefixed name only matches elements that are in no namespace. The query needs the qualified name rather than the local name, which means we should query the DOM with the code: xml_dom.xpath '//atom:entry', 'atom' => 'http://www.w3.org/2005/Atom'.

Fixed XPath Queries
xml_dom.xpath '.' # returns document
xml_dom.xpath '//.' # returns all element nodes
xml_dom.xpath '/atom:feed', 'atom' => 'http://www.w3.org/2005/Atom' # returns root node
xml_dom.xpath '//atom:entry', 'atom' => 'http://www.w3.org/2005/Atom' # returns 4 entry nodes
xml_dom.xpath '//media:group', 'media' => 'http://search.yahoo.com/mrss/' # returns the 4 media:group nodes

So in a sentence: XPath in Nokogiri doesn’t inherit the default namespace, so when querying a DOM that has a default namespace, we need to explicitly specify the namespace in the XPath query. It is really a hidden requirement, and developers are very likely to overlook it!

So if there is no risk of naming collisions, it is recommended to avoid this kind of “silly” issue by removing the namespaces from the DOM. The Nokogiri::XML::Document class provides the Nokogiri::XML::Document#remove_namespaces! method for this purpose.

Weird! "def" behaves different in class_eval and instance_eval

I find the behavior of the keyword def in Ruby really confusing! At least, really confusing to me!
In most cases, we use def in a class context, where it defines an instance method on that class.

Use def in class
class Foo
  def foo
    :foo
  end

  $context = self
end

Foo.new.foo.should == :foo
$context.should == Foo

Besides the typical usage, we can also use def in a block.

Use def in class_eval block
class Foo; end

Foo.class_eval do
  def foo
    :foo
  end

  $context = self
end

Foo.new.foo.should == :foo
$context.should == Foo

The previous piece of code works as if we reopened the class Foo and added a new method to it. That is not hard to understand either.

What really surprised me is the following code:

Use def in instance_eval block
class Foo; end

Foo.instance_eval do
  def foo
    :foo
  end

  $context = self
end

Foo.foo.should == :foo # Method foo goes into the Foo class itself rather than Foo's instance!
$context.should == Foo

Here we can see that the method foo goes into the Foo class itself, rather than into Foo’s instances! But $context is still the Foo class!

In short, calling def foo in an instance_eval block is equivalent to calling ‘def self.foo’ in a class_eval block, even though self in both blocks is the class itself.
So we can see that the keyword def works differently from the methods define_method and define_singleton_method: those two are ordinary method calls on self, while def does not depend on self but on a separate, implicit “current class”, which instance_eval sets to the receiver’s singleton class.

To me it is kind of hard to understand, and confusing. I think it is not a good design!
Ruby is different from languages such as Java or C#: Ruby uses methods to take the place of keywords in other languages, such as public, protected and private. In most languages they are keywords, but in Ruby they are actually methods defined on Module.
This design is good, because it enables developers to extend the “keywords” they can use! But at the same time, it blurs the boundary between customizable methods and predefined keywords, so people won’t pay much attention to the difference between the two. That makes it important to keep the behavior of methods and keywords consistent. But def breaks the consistency, so it is confusing!

Look at the following code:

def vs define_method
definition_block = Proc.new do
  def foo
    :foo
  end

  define_method :bar do
    :bar
  end
end

class A; end
class B; end

A.class_eval &definition_block
B.instance_eval &definition_block

Comparing class A and class B, we find that they are different, even though they are defined with exactly the same block!
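To spell out exactly how the two classes differ, here is a self-contained sketch of the same experiment. The class names ViaClassEval and ViaInstanceEval are made up so that the snippet does not clash with the A and B above:

```ruby
definitions = proc do
  # `def` targets the implicit "current class", not self
  def foo
    :foo
  end

  # define_method is an ordinary method call on self
  define_method(:bar) { :bar }
end

class ViaClassEval; end
class ViaInstanceEval; end

ViaClassEval.class_eval(&definitions)
ViaInstanceEval.instance_eval(&definitions)

p ViaClassEval.new.foo                    # => :foo
p ViaClassEval.new.bar                    # => :bar
p ViaInstanceEval.foo                     # => :foo
p ViaInstanceEval.new.bar                 # => :bar
p ViaInstanceEval.new.respond_to?(:foo)   # => false
```

In both blocks self is the class, so define_method creates an instance method bar in both cases. Only def changes its target: under instance_eval it lands on the singleton class, which makes foo a class method.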

Introduce Prototype Style OO inheritance in Ruby

A few days ago, I posted a blog about the Ruby inheritance hierarchy. When I discussed the topic with Yang Lin, he mentioned a crazy but interesting idea: introducing prototype-based OO into Ruby.
Lin suggested that a possible approach is to use clone. But I’m not familiar with the clone mechanism in Ruby, so I tried another approach.
Thanks to Ruby’s super powerful meta-programming mechanism, I can forward unknown messages to the prototype by using method_missing. And I encapsulated the code in a module, so every instance extended with that module obtains this capability.

Prototype Module
module Prototype
  def inherit_from(prototype)
    @prototype = prototype
    self
  end

  def create_child
    Object.new.extend(Prototype).inherit_from(self)
  end

  def respond_to?(msg_id, priv = false)
    return true if super

    if @prototype
      @prototype.respond_to?(msg_id, priv)
    else
      false
    end
  end

  def method_missing(symbol, *args, &block)
    if @prototype
      @prototype.send(symbol, *args, &block)
    else
      super
    end
  end

  def self.new
    Object.new.extend(Prototype)
  end
end

If I have the following code:

Prototype Inheritance
a = Object.new

def a.foo
  'foo'
end

b = Object.new
b.extend(Prototype).inherit_from(a)

c = b.create_child

p b.foo # => 'foo'
p c.foo # => 'foo'

So b.foo and c.foo will yield ‘foo’.

And I can override the parent implementation by defining a method with the same name:

Prototype Overrides
def a.bar
  'bar'
end

def c.bar
  'c.bar'
end

p a.bar # => 'bar'
p b.bar # => 'bar'
p c.bar # => 'c.bar'

So I added a new singleton method bar to a, b automatically inherits it, and I override bar on object c.

In conclusion, we are able to introduce prototype-based inheritance into Ruby by using Ruby’s powerful meta-programming mechanism. This implementation is only a proof of concept, so its performance is not very good. But we can try to improve the performance by consolidating the lookup: the child object queries the parent the first time, and if the invocation succeeds, it defines the behavior as a real method on itself, to avoid going through method_missing every time.
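One possible shape of that consolidation is sketched below. It is an untuned illustration, not a finished design: CachingPrototype is a hypothetical variant of the module above, it only forwards positional arguments and blocks, and it uses respond_to_missing? instead of overriding respond_to? directly.

```ruby
module CachingPrototype
  def inherit_from(prototype)
    @prototype = prototype
    self
  end

  # Lets respond_to? answer true for anything the prototype understands.
  def respond_to_missing?(name, include_private = false)
    (!@prototype.nil? && @prototype.respond_to?(name, include_private)) || super
  end

  def method_missing(name, *args, &block)
    return super unless @prototype && @prototype.respond_to?(name)

    # Consolidate: install a real singleton method that forwards to the
    # prototype, so later calls no longer go through method_missing.
    define_singleton_method(name) do |*forwarded, &forwarded_block|
      @prototype.public_send(name, *forwarded, &forwarded_block)
    end
    public_send(name, *args, &block)
  end
end

parent = Object.new
def parent.foo
  'foo'
end

child = Object.new.extend(CachingPrototype).inherit_from(parent)
p child.singleton_methods(false).include?(:foo)   # => false
p child.foo                                        # => "foo"
p child.singleton_methods(false).include?(:foo)   # => true
```

Overriding still works as before, because a later def child.foo simply replaces the generated forwarder. The trade-off is that a forwarder installed on the child stays in place even if the method is later removed from the prototype.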

Ruby Class Inheritance

I just realized I have misunderstood Ruby’s “class methods” for quite a long time!

Here is a piece of code:

Instance Methods Inheritance
class A
  def foo
    'foo'
  end
end

class B < A
end

a = A.new
b = B.new

a.foo.should == 'foo'
b.foo.should == 'foo'

The previous piece of code demonstrates the typical inheritance mechanism in almost every class-based OO language. (There are a few exceptions that use prototype inheritance, such as JavaScript, but it is also a mystery whether JavaScript is an OO language XD.)
In most common OO languages, this is what inheritance is about! But in Ruby, things are not that simple, thanks to Ruby’s eigen-class (aka singleton class or metaclass).

In Ruby, I just found that a derived class inherits not just the instance methods but also the class methods! It was kind of a surprise to me!

Class Methods Inheritance
class A
  def self.bar
    'bar'
  end
end

class B < A
end

A.bar.should == 'bar'
B.bar.should == 'bar'

Most people who know Java, C# or even C++ won’t be surprised by A.bar.should == 'bar', but you might be surprised by B.bar.should == 'bar', like I was.

To me, bar is declared on class A and B inherits from A, so I can call a method declared on class A through class B! It is amazing!

This works because in Ruby a “class method” is actually an instance method of the eigen-class of the class, and def self.foo is just syntactic sugar. So we can rewrite the code as:

Rewritten Class Methods Inheritance
class A
end

class B < A
end

class << A
  def bar
    'bar'
  end
end

A.bar.should == 'bar'
B.bar.should == 'bar'

If we call A’s eigen-class AA and B’s eigen-class BB, we will find that BB.superclass == AA.

BB and AA
class A; end
class B < A; end

AA = class << A; self; end
BB = class << B; self; end

B.superclass.should == A
BB.superclass.should == AA

We know A is actually an instance of AA, and B is an instance of BB, so on B we can obviously call the instance methods defined on AA.
That’s the reason why class methods in Ruby can be inherited!

But there is an inconsistency in Ruby, at least in the version I’m using: AA is the superclass of BB, but you won’t find AA in BB’s ancestors! In fact, BB.ancestors yields something like [Class, Module, Object, Kernel, BasicObject], provided no extra module has been included into Class, Module or Object. (Ruby 2.1 and later do list singleton classes in ancestors, so there AA shows up as expected.)

Inconsistency
class A; end
class B < A; end

AA = class << A; self; end
BB = class << B; self; end

BB.superclass.should == AA
BB.ancestors.should_not include(AA) # holds on Ruby 1.9/2.0; Ruby 2.1+ does list AA
# BB.ancestors == [Class, Module, Object, Kernel, BasicObject]
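The same relationships can be checked without naming the eigen-classes by hand, using Object#singleton_class (available since Ruby 1.9.2). This is a small sketch with made-up class names Parent and Child:

```ruby
class Parent
  def self.bar
    'bar'
  end
end

class Child < Parent
end

# The "class method" is an instance method of Parent's singleton class.
p Parent.singleton_class.instance_methods(false)               # => [:bar]

# The singleton classes form a parallel inheritance chain.
p Child.singleton_class.superclass == Parent.singleton_class   # => true

# So looking up bar on Child finds it one step up that chain.
p Child.bar                                                    # => "bar"

# Version dependent: true on Ruby 2.1 and later, false on older versions.
p Child.singleton_class.ancestors.include?(Parent.singleton_class)
```

The last line is the one whose result depends on the Ruby version, which is why no expected output is given for it.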

This design is weird to me and kind of hard to understand, so for quite a long time I didn’t even know that class methods in Ruby can be inherited!
I drew a graph to show the relationships among the classes. In the graph, I use <class:A> to indicate the eigen-class of A. A line with an empty triangle represents inheritance, and an arrow line represents instantiation.
The graph is not a complete one: I omitted some unimportant classes, and I use dotted lines to indicate that something is missing on the line.

Inheritance Hierarchy

Pitfall when returning a string via JSON in Rails

Today we hit a weird problem when returning a string via JSON.

Here is the CoffeeScript code:

Front End
$.post serverUrl, data, (status) ->
  console.log status

And here is our controller:

Backend Action
def action
  # do some complex logic

  render json: "success"
end

The code looks perfect, but we found that the callback is never called! If you check the network traffic, you will find that the server does send its response “success”, but the callback is not called!

After spending half an hour struggling against jQuery, we finally found the problem!

The reason is that success is not valid JSON data! A valid JSON string must be wrapped in double quotes, or the JSON parser will treat it as a token, like true, false or null.

So to fix the problem, we need to change our action code:

Fixed Backend Action
def action
  # do some complex logic

  render json: '"success"'
end

This is really a pitfall, since the wrong code looks so natural!
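The difference between the two response bodies can be checked outside Rails with Ruby’s standard json library. This is a minimal sketch; valid_json? is a hypothetical helper, and returning a hash is one possible alternative to quoting the string by hand:

```ruby
require 'json'

# True when `text` parses as a JSON document.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

p valid_json?('success')                   # => false
p valid_json?('{"status":"success"}')      # => true

# to_json adds the quotes that were missing from the hand-written body.
p 'success'.to_json == '"success"'         # => true
p({ status: 'success' }.to_json)           # => "{\"status\":\"success\"}"
```

Letting to_json produce the body, rather than typing the quotes by hand, makes it much harder to send an unquoted token by accident.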

Pretty Singleton in RoR app

Thanks to Ruby’s powerful meta-programming capability and Rails’ delegate syntax, we can easily write a graceful singleton class that makes the class work like an instance.

In a traditional language such as C#, we usually write singleton code like this:

Singleton in C#
class Foo
{
    // Singleton declaration (not readonly, because it is assigned lazily)
    private static Foo instance;

    public static Foo Instance
    {
        get
        {
            // Create the single instance on first access (not thread-safe)
            if (instance == null)
            {
                instance = new Foo();
            }
            return instance;
        }
    }

    // Define instance behaviors
    // ...
}

The previous approach works fine, but the code that uses Foo is kind of ugly. Every time we want to invoke the method Bar on Foo, we need to write Foo.Instance.Bar() rather than the more graceful Foo.Bar().
To solve this problem, we need to implement the class this way:

Class Delegation in C#
class Foo
{
    // Singleton Declaration
    // ...

    // Define instance behaviors
    public void Bar()
    {
        // Bar behaviors
        // ...
    }

    public static void Bar()
    {
        Instance.Bar();
    }

    public string Baz
    {
        get { /* Getter behavior */ }
        set { /* Setter behavior */ }
    }

    public static string Baz
    {
        get { return Instance.Baz; }
        set { Instance.Baz = value; }
    }
}

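This approach simplifies the caller code but complicates the declaration. (Strictly speaking, C# does not allow a static member and an instance member with the same signature in one class, so the static forwarders would have to live in a separate static class, which is even more code.) You can use tricks such as code snippets, or code generation technology such as Text Template or CodeSmith, to generate the dull delegation code. But it is still not graceful at all.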

If we write the same code in Ruby, things become much easier, thanks to Ruby’s powerful meta-programming capability.

Singleton in Ruby
# foo.rb
class Foo
  extend ActiveSupport::Autoload

  autoload :Base
  include Base

  autoload :ClassMethods
  extend ClassMethods
end

# foo/base.rb
module Foo::Base
  # Define instance behaviors
  # ...
end

# foo/class_methods.rb
module Foo::ClassMethods
  # Singleton Declaration
  def instance
    @instance ||= new
  end

  delegate *Foo::Base.instance_methods, :to => :instance
end

So in the Ruby solution, one statement, delegate *Foo::Base.instance_methods, :to => :instance, delegates all the methods defined in Base to the instance.

Besides this solution, there is another, cheaper solution that also works:

Singleton in Ruby
# foo.rb
class Foo
  # The one-argument form of autoload comes from ActiveSupport::Autoload
  extend ActiveSupport::Autoload

  autoload :Base
  include Base
  extend Base
end

# foo/base.rb
module Foo::Base
  # Define instance behaviors
  # ...
end

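The two approaches make the code behave slightly differently, but both work: in the first one the class forwards to a real instance, while in the second one the class object itself plays the role of the instance, so any state lives on the class.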
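For readers without Rails at hand, the forwarding idea can also be sketched with plain Ruby and the standard library’s Singleton module. This is an illustration under assumptions, not the approach above: Settings, bar and baz are made-up names, and the loop stands in for ActiveSupport’s delegate.

```ruby
require 'singleton'

class Settings
  include Singleton   # makes .new private and provides Settings.instance

  def bar
    'bar'
  end

  def baz(name)
    "baz #{name}"
  end

  # Forward every public instance method defined above to the single
  # instance, so callers can write Settings.bar instead of
  # Settings.instance.bar.
  public_instance_methods(false).each do |method_name|
    define_singleton_method(method_name) do |*args, &block|
      instance.public_send(method_name, *args, &block)
    end
  end
end

p Settings.bar                                    # => "bar"
p Settings.baz('x')                               # => "baz x"
p Settings.instance.equal?(Settings.instance)     # => true
```

The loop has to come after the method definitions, because public_instance_methods(false) only sees what has been defined so far.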

Extend RSpec DSL

I’m working on a project that needs some complicated HTML snippets for tests, which cannot be easily generated with a factory. So I put these snippets into fixture files.

RSpec provides a very convenient DSL keyword, let, which allows us to define something for a test and cache it within that test. I wanted a similar keyword for my HTML fixtures. To achieve this, I decided to extend the DSL.

So I created a module that contains the new DSL I want to have:

DSL module
# spec/support/html_pages.rb

module HtmlPages
  def load_html(name)
    let name do
      file = Rails.root.join('spec/html_pages', "#{name}.html")
      Nokogiri::HTML File.read(file)
    end
  end
end

Put this file into spec/support; by default, spec_helper.rb requires the files there for you.
Then we need to tell RSpec to load the DSL into the test cases.

Load DSL
RSpec.configure do |config|
  # ...
  config.extend HtmlPages
  # ...
end

By telling config to extend the module, our DSL is loaded as class methods of RSpec::Core::ExampleGroup, where let is defined.

HINT: RSpec config has another way to extend the DSL: calling config.include. The DSL methods are then injected into the example group instances, so they can be used inside the test cases themselves. That’s how runtime DSLs like FactoryGirl work.
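The difference between the two hooks can be illustrated without RSpec at all. In the sketch below, every name (FixtureMacros, FixtureHelpers, FakeExampleGroup, fixture, wrap) is made up, and the class only stands in for an example group: extend adds a definition-time macro to the class, while include adds a run-time helper to its instances.

```ruby
# Definition-time DSL: becomes a class method, like `let`.
module FixtureMacros
  def fixture(name, &loader)
    define_method(name) do
      @fixture_cache ||= {}
      # Evaluate the loader once per instance, then reuse the result
      @fixture_cache.fetch(name) { @fixture_cache[name] = instance_exec(&loader) }
    end
  end
end

# Run-time DSL: becomes an instance method, usable inside examples.
module FixtureHelpers
  def wrap(body)
    "<html>#{body}</html>"
  end
end

class FakeExampleGroup
  extend FixtureMacros     # what config.extend does for example groups
  include FixtureHelpers   # what config.include does for example groups

  fixture(:page) { wrap('hello') }
end

example = FakeExampleGroup.new
p example.page                             # => "<html>hello</html>"
p example.page.equal?(example.page)        # => true
p FakeExampleGroup.respond_to?(:fixture)   # => true
p example.respond_to?(:fixture)            # => false
```

The macro is visible only while the class body is being evaluated, and the helper only while an example runs, which mirrors where let and the FactoryGirl-style methods are available.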