We ran into a very weird runtime error today, after migrating some data from a legacy database.
Because the models had not changed, we simply created the tables and copied the data over from the legacy database directly. To make sure the migration didn’t break anything, we also wrote some migration tests to verify data integrity, and all of them passed.
Everything looked perfect until the app went live. Then the app started crashing occasionally when we tried to create a new record in the system. Sometimes it worked fine, but sometimes we got the error “duplicate key value violates unique constraint ‘xxxxx_pkey’”.
It was weird because we were really confident in our unit tests and migration tests, so we assumed the problem could not be related to the migration or to our logic.
After some manual testing, we found that we got the same error when creating an entry with a raw SQL INSERT query. So it looked like a Postgres issue, and the cause turned out to be the primary key, which is an auto-generated id.
Postgres uses a sequence to generate auto-incrementing ids. A sequence remembers the last number it generated and produces the next one by adding 1. During the data migration we copied rows from one table into a new table, and to preserve the relationships between records we also copied the primary key of each row. As a result, although we had inserted a number of records into the table, the sequence bound to the primary key was never advanced.
For example, we have inserted the following 3 entries:
{id: 1, name: ‘Jack’}
{id: 2, name: ‘Rose’}
{id: 4, name: ‘Hook’}
But because the ids were inserted explicitly, the sequence has never been used and will still hand out 1 first. So when we execute the following SQL:
Insert entry
INSERT INTO users (name)
VALUES ('Robinhood');
The sequence generates 1 as the id, which conflicts with the entry {id: 1, name: 'Jack'}, so the database raises the “duplicate key” exception. In practice the ids are usually not continuous, because deleted records leave “holes” in the id range. So our app can still insert an entry successfully whenever the generated id falls into one of those holes.
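To make the “sometimes it works, sometimes it fails” behavior concrete, here is a toy simulation in Ruby. It is only a sketch: ToySequence and ToyTable are made-up stand-ins, not Postgres itself, and the one assumption they model is that a sequence keeps advancing even when the insert that asked for a value fails.

```ruby
# Toy stand-in for a Postgres sequence: it only remembers the last value it
# handed out and knows nothing about the rows that already exist.
class ToySequence
  def initialize
    @last = 0
  end

  def nextval
    @last += 1
  end
end

# Toy stand-in for a table with a unique primary key.
class ToyTable
  def initialize(sequence)
    @sequence = sequence
    @rows = {}
  end

  # Used by the migration: the id is supplied, so the sequence is not touched.
  def insert_with_id(id, name)
    raise "duplicate key value: #{id}" if @rows.key?(id)

    @rows[id] = name
  end

  # Used by the app: the id comes from the sequence.
  def insert(name)
    insert_with_id(@sequence.nextval, name)
  end
end

users = ToyTable.new(ToySequence.new)
users.insert_with_id(1, 'Jack')
users.insert_with_id(2, 'Rose')
users.insert_with_id(4, 'Hook')

# Five inserts from the "app": ids 1, 2 and 4 collide, 3 and 5 are free.
results = 5.times.map do
  begin
    users.insert('Robinhood')
    :ok
  rescue RuntimeError
    :duplicate
  end
end

p results
```

The third and fifth attempts succeed because ids 3 and 5 are unused, which matches the occasional crashes we saw in production.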
To solve this problem, we also need to update the sequences on the table, including the one behind the primary key. Postgres allows a sequence to be updated with the ALTER SEQUENCE command, so we can restart the sequence at a big enough integer:
Update Sequence
ALTER SEQUENCE users_id_seq RESTART WITH 10000;
A smarter way is to query the whole table to find out the maximum id number, and set the sequence to that number + 1.
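As a rough sketch of that smarter way, the helper below only builds the SQL string; it does not talk to a database. The helper name reset_sequence_sql is made up for illustration, and it assumes the column is backed by a serial-style sequence so that Postgres’ pg_get_serial_sequence can find it. The third argument false to setval means the next generated id is exactly the given value.

```ruby
# Builds the SQL that moves a table's id sequence just past the largest
# existing id. `table` and `column` are assumed to be trusted identifiers
# (this sketch does no quoting or escaping).
def reset_sequence_sql(table, column = 'id')
  "SELECT setval(pg_get_serial_sequence('#{table}', '#{column}'), " \
    "COALESCE((SELECT MAX(#{column}) FROM #{table}), 0) + 1, false);"
end

sql = reset_sequence_sql('users')
puts sql
```

Running the generated statement once per migrated table, right after the data copy, should leave every sequence pointing at the first free id.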
Developers familiar with Rails are very likely used to the following code, and will not be surprised by it:
ActiveRecord
class User < ActiveRecord::Base
end

first_user = User.find(0)
But the code is not as simple as it looks, especially for people coming from the Java or C# world. In this piece of code, the class User inherits the class method find from its parent class ActiveRecord::Base. (If you are in doubt or interested in how this works, see the post Ruby Class Inheritance.)
If you write the following code, it should work fine:
Simple Class
class Base
  def self.foo
    bar_result = new.bar
    "foo #{bar_result}"
  end

  def bar
    'bar'
  end
end

class Derived < Base
end

Base.new.bar.should == 'bar'
Derived.new.bar.should == 'bar'
Base.foo.should == "foo bar"
Derived.foo.should == "foo bar"
In the Ruby world, most of the time you can replace inheritance with a module mixin. So we try to refactor the code as follows:
Extract to Module
module Base
  def self.foo
    bar_result = new.bar
    "foo #{bar_result}"
  end

  def bar
    'bar'
  end
end

class Derived
  include Base
end
If we run the tests again, the second one fails:
Test
Derived.new.bar.should == 'bar' # Passed
Derived.foo.should == 'foo bar' # Failed
The test fails because the method ‘foo’ is not defined on Derived! This is the interesting part: if we inherit from a class, the class methods of the base class are available on the subclass; but if we include a module, the class methods of the module are not available on the host class!
As we discussed before (Ruby Class Inheritance), mixing in a module is equivalent to inserting an anonymous class that carries the module’s instance methods into the ancestor chain of the host class. The module’s own singleton methods, such as Base.foo, are not part of that anonymous class, so the host class never sees them.
So is there any way to make all the tests pass with the module approach? The answer is yes, absolutely, but it takes a little trickery.
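The original snippet for this step is not shown here, so the following is only a sketch of one common Ruby idiom for it, not necessarily the trick the post had in mind. It moves the class-level methods into a nested module (named ClassMethods here purely by convention) and uses the included hook to extend the host class with it.

```ruby
module Base
  # Methods that should end up as class methods on the host class.
  module ClassMethods
    def foo
      bar_result = new.bar
      "foo #{bar_result}"
    end
  end

  # Ruby calls this hook whenever the module is included somewhere.
  def self.included(host)
    host.extend(ClassMethods)
  end

  def bar
    'bar'
  end
end

class Derived
  include Base
end

puts Derived.new.bar
puts Derived.foo
```

Note that with this layout Base.foo itself no longer exists, which is reasonable because a module cannot be instantiated with new anyway.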
I’m writing a cake script that helps me build the growlStyle bundle. I wanted the script to watch the source files for changes and rebuild whenever a file changed. So I wrote the following code:
Watching code change
files = fs.readdirSync getLocalPath('source')
for file in files
  fs.watch file, ->
    console.log "File changed, rebuilding..."
    build()
The code works when I edit the code with TextMate, but fails when I use RubyMine!
Super weird!
After half an hour of debugging, I found the following interesting phenomena:
Given I’m using TextMate
When I change the file the 1st time
Then a ‘change’ event is captured
When I change the file the 2nd time
Then a ‘change’ event is captured
When I change the file the 3rd time
Then a ‘change’ event is captured

Given I’m using RubyMine
When I change the file the 1st time
Then a ‘rename’ event is captured
When I change the file the 2nd time
Then no event is captured
When I change the file the 3rd time
Then no event is captured
From the result, it is easy to see that the script fails because the ‘change’ event is not triggered as expected when using RubyMine. The reason for RubyMine’s “weird” behavior might be that RubyMine wants to protect the file’s integrity, so it writes the file atomically, as follows:
1. RubyMine writes the file content to a temp file
2. RubyMine removes the original file
3. RubyMine renames the temp file to the original file name
This workflow ensures that the content is either fully written or not written at all. In short, RubyMine does not actually write to the file; it replaces the original file with another one, and the original is removed or moved to some special location.
On the other hand, according to the Node.js documentation of fs.watch, Node uses kqueue on Mac to implement this behavior. And according to the kqueue documentation, it uses a file descriptor as the identifier, and a file descriptor is bound to the file itself rather than to its path. So when the file is renamed, we keep tracking the file under its new name. That’s why we lose track of the file after the first ‘rename’ event. In our case, we actually want to identify the file by its path rather than by its file descriptor.
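The effect of the three steps above can be reproduced outside any editor. The Ruby sketch below is an illustration only: atomic_write is a made-up helper that imitates the assumed RubyMine behavior, an open File object stands in for the descriptor a watcher would hold, and a POSIX-style file system is assumed (removing an open file behaves differently on Windows).

```ruby
require 'tmpdir'

# Replace `path` the way the steps above describe: write a temp file,
# remove the original, then rename the temp file into place.
def atomic_write(path, content)
  tmp = "#{path}.tmp"
  File.write(tmp, content)
  File.delete(path) if File.exist?(path)
  File.rename(tmp, path)
end

inode_changed = nil
stale_content = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'source.jade')
  File.write(path, 'v1')

  watcher = File.open(path)          # stands in for the descriptor a watcher holds
  inode_before = File.stat(path).ino

  atomic_write(path, 'v2')

  # The path now points at a different file than the one we opened.
  inode_changed = File.stat(path).ino != inode_before
  stale_content = watcher.read       # the descriptor still sees the old file
  watcher.close
end

puts "inode changed: #{inode_changed}"
puts "descriptor still sees: #{stale_content}"
```

The path stays the same, but the file behind it is a new one, so anything that tracks the old descriptor never hears about later writes.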
To solve this issue, we have 2 potential solutions:
1. Apply fs.watch to the directory that holds the source file, in addition to the source file itself. When the file is updated in place, as TextMate does, the watcher on the file raises the ‘change’ event. When the file is updated atomically, as RubyMine does, the watcher on the directory raises 2 ‘rename’ events. So theoretically, we can track changes to the file no matter how it is updated.
2. Use the old-fashioned fs.watchFile function, which tracks changes with fs.stat. Compared to fs.watch, fs.watchFile is less efficient because of its polling mechanism, but it does track the file by file name rather than by file descriptor. So it won’t be fooled by the fancy atomic write.
At first glance, the 1st solution looks better than the 2nd, because it uses events rather than old-fashioned polling. Even the documentation of fs.watchFile says to use fs.watch instead of fs.watchFile when possible.
But it is actually kind of painful to write such code, since the ‘rename’ event on the directory is triggered not only by file updates but also by adding and removing files.
Also, the ‘rename’ event is triggered twice when a file is updated. Obviously we cannot rebuild when the first ‘rename’ event fires, or the build might fail because the file is absent; and if we react to both events, we trigger the build twice in a really short period of time.
So in fact, for our problem the polling fs.watchFile is more useful: being old-fashioned protects it from being fooled by the ‘fancy’ atomic file write.
HINT: Be careful about the differences between fs.watch and fs.watchFile:
The meaning of the filename parameter:
The filename parameter of fs.watch is path sensitive: it accepts ‘source.jade’ or ‘/path/to/source.jade’.
The filename parameter of fs.watchFile isn’t path sensitive: it only accepts ‘/path/to/source.jade’.
The callback invocation condition:
fs.watch invokes the callback when the file is renamed or changed.
fs.watchFile invokes the callback when the file is accessed, including reads as well as writes. So you need to compare the mtime in the stat objects passed to the callback: the file has changed only when the mtime has changed.
Response time:
fs.watch uses events, so it captures the change almost in real time.
fs.watchFile uses polling, so the notification may be delayed for a while. By default, the maximum delay is about 5s.
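The idea behind path-based polling is simple enough to sketch in a few lines. The Ruby snippet below is not how Node implements fs.watchFile; mtime_poller is a made-up helper that just re-stats a path and compares modification times, which is enough to show why an atomic replace does not confuse it.

```ruby
require 'tmpdir'

# Returns a lambda; each call re-stats `path` and reports whether its
# modification time differs from the one seen on the previous call.
def mtime_poller(path)
  last_seen = File.exist?(path) ? File.mtime(path) : nil
  lambda do
    current = File.exist?(path) ? File.mtime(path) : nil
    changed = current != last_seen
    last_seen = current
    changed
  end
end

observations = []

Dir.mktmpdir do |dir|
  path = File.join(dir, 'source.jade')
  File.write(path, 'v1')
  poll = mtime_poller(path)

  observations << poll.call          # nothing has happened yet

  # Replace the file atomically, then give the new file a clearly later mtime
  # so the example does not depend on timestamp resolution.
  File.write("#{path}.tmp", 'v2')
  File.rename("#{path}.tmp", path)
  later = Time.now + 60
  File.utime(later, later, path)

  observations << poll.call          # the path now reports a new mtime
  observations << poll.call          # no further change
end

p observations
```

Because the poller looks the file up by path on every call, it does not matter whether the file behind that path was edited in place or swapped for a new one.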
Today I found an interesting difference between padding and margin while working on the Metrics 2.0 Introduction page. There are several VideoThumbnail widgets on the page, each of which contains a video snapshot and a paragraph of text description. Here is the HTML DOM of the widget, written in Haml:
Since the description could be very short or very long, I made the div that contains the description scrollable, with the following stylesheet for the caption div:
VideoThumbnail Widget Stylesheet
.caption {
  padding: 9px;
  height: 150px;
  overflow-y: auto;
}
The style looks fine, and here is how it looks:
But very soon I found that the widget with a scrollbar is taller than the one without it (the red rectangle in the following snapshot):
This happens because the paddings of 2 adjacent elements are not merged together the way adjacent margins are. To solve the issue, I changed the padding to margin in the stylesheet:
VideoThumbnail Widget Stylesheet
.caption {
  padding: 0;
  margin: 9px;
  height: 150px;
  overflow-y: auto;
}
Now the bottom margin is correct, but I found that the scrollbar begins to occupy the space of the content, which is not good! The center widget uses padding (blue) and the right one uses margin (red).
Then I remembered that with padding, the scrollbar takes the space of the right padding, but with margin, it takes the space of the content. So I updated the stylesheet this way:
VideoThumbnail Widget Stylesheet
.caption {
  padding: 0 9px 0 0;
  margin: 9px 0 9px 9px;
  height: 150px;
  overflow-y: auto;
}
I use padding on the right but margin on the other sides, so the vertical scrollbar takes the space of the right padding when necessary. It is a very interesting CSS trick, and it works fine in WebKit-based browsers.
Nokogiri is a really popular XML and HTML library for Ruby. People love Nokogiri not just because it is powerful and fast, but above all because it is flexible and convenient. Nokogiri works perfectly in most respects, but there is a big pitfall when handling XML namespaces!
I met a super weird issue when processing the XML returned by the Google Data API. The API returns the following XML document:
I instantiated a Nokogiri::XML DOM with the XML document, and then tried to query the DOM with XPath, xml_dom.xpath '//entry':
Query DOM
xml_dom = Nokogiri::XML Faraday.get(api_url).body
entries = xml_dom.xpath '//entry'
I expected entries to be an array of 4 elements, but it was actually an empty array. After a few tries, I found that the query yields an empty array whenever I use an element name in the query.
Try XPath Queries
xml_dom.xpath '.' # returns the document
xml_dom.xpath '//.' # returns all element nodes
xml_dom.xpath '/feed' # returns an empty array
xml_dom.xpath '//entry' # returns an empty array
xml_dom.xpath '//media:group', 'media' => 'http://search.yahoo.com/mrss/' # returns the 4 media:group nodes
It is super weird.
After half an hour of fighting with Nokogiri, I began to realize that it must be related to namespaces. I found that there is an attribute on the root element of the document, xmlns="http://www.w3.org/2005/Atom", which means that all elements in the document without an explicit namespace prefix are in the namespace http://www.w3.org/2005/Atom by default.
And the XPath query is namespace sensitive! It requires the qualified name rather than the local name, which means we should query the DOM with: xml_dom.xpath '//atom:entry', 'atom' => 'http://www.w3.org/2005/Atom'.
So in a sentence: XPath in Nokogiri doesn’t apply the document’s default namespace to unprefixed names, so when querying a DOM that has a default namespace we need to specify the namespace explicitly in the XPath query. It is really a hidden requirement and is very likely to be overlooked by developers!
So if there are no naming collisions, it is recommended to avoid this kind of “silly” issue by removing the namespaces from the DOM. The Nokogiri::XML::Document class provides the Nokogiri::XML::Document#remove_namespaces! method for this purpose.
I find the behavior of the keyword def in Ruby really confusing! At least, really confusing to me! In most cases we use def in a class context, where it defines an instance method on that class.
Use def in class
class Foo
  def foo
    :foo
  end

  $context = self
end

Foo.new.foo.should == :foo
$context.should == Foo
Besides the typical usage, we can also use def in a block.
Use def in class_eval block
class Foo; end

Foo.class_eval do
  def foo
    :foo
  end

  $context = self
end

Foo.new.foo.should == :foo
$context.should == Foo
The previous piece of code works as if we had reopened the class Foo and added a new method to it. That is not hard to understand either.
What really surprised me is the following code:
Use def in instance_eval block
class Foo; end

Foo.instance_eval do
  def foo
    :foo
  end

  $context = self
end

Foo.foo.should == :foo # Method foo goes into the Foo class itself rather than Foo's instance!
$context.should == Foo
Here we find that the method foo goes onto the Foo class itself, rather than onto Foo’s instances! But $context is still the Foo class!
So in a word, calling def foo in an instance_eval block is equivalent to calling ‘def self.foo’ in a class_eval block, even though self in both blocks is the class itself. So we can conclude that the keyword def works differently from the methods define_method and define_singleton_method: it doesn’t depend on self, but the two methods do!
To me this is hard to understand and confusing, and I don’t think it is a good design! Ruby differs from languages such as Java or C#: it uses methods in place of what are keywords in other languages, such as public, protected and private. In most languages these are keywords, but in Ruby they are actually methods defined on Module. This design is good, because it lets developers extend the set of “keywords” they can use! But at the same time it blurs the boundary between customizable methods and predefined keywords, so people don’t pay much attention to the difference between the two. That makes it important to keep the behavior of methods and keywords consistent. But def breaks that consistency, so it is confusing!
Look at the following code:
def vs define_method
definition_block = Proc.new do
  def foo
    :foo
  end

  define_method :bar do
    :bar
  end
end

class A; end
class B; end

A.class_eval &definition_block
B.instance_eval &definition_block
Comparing class A and class B, we find that they are different, even though they were defined with exactly the same block!
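To see exactly how A and B differ, the sketch below runs the same block and records where foo and bar ended up. The summary hash and its key names are just for illustration.

```ruby
definition_block = proc do
  def foo
    :foo
  end

  define_method(:bar) do
    :bar
  end
end

class A; end
class B; end

A.class_eval(&definition_block)
B.instance_eval(&definition_block)

# Record where each method is reachable.
summary = {
  a_instance_foo: A.new.respond_to?(:foo),
  a_class_foo:    A.respond_to?(:foo),
  b_instance_foo: B.new.respond_to?(:foo),
  b_class_foo:    B.respond_to?(:foo),
  a_instance_bar: A.new.respond_to?(:bar),
  b_instance_bar: B.new.respond_to?(:bar)
}

p summary
```

In both cases define_method follows self, so bar becomes an instance method of A and of B alike. Only def changes its target: an instance method on A under class_eval, a class-level method on B under instance_eval.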
A few days ago I posted a blog entry about the Ruby inheritance hierarchy. While discussing the topic with Yang Lin, he mentioned a crazy but interesting idea: introducing prototype-based OO into Ruby. Lin suggested that one possible approach is to use clone, but I’m not familiar with the clone mechanism in Ruby, so I tried another approach. Thanks to Ruby’s super powerful meta-programming mechanism, I can forward unknown messages to the prototype by using method_missing. I encapsulated the code in a module, so every instance that extends the module obtains this capability.
Prototype Module
module Prototype
  def inherit_from(prototype)
    @prototype = prototype
    self
  end

  def create_child
    Object.new.extend(Prototype).inherit_from(self)
  end

  def respond_to?(msg_id, priv = false)
    return true if super

    if @prototype
      @prototype.respond_to?(msg_id, priv)
    else
      false
    end
  end

  def method_missing(symbol, *args, &block)
    if @prototype
      @prototype.send(symbol, *args, &block)
    else
      super
    end
  end

  def self.new
    Object.new.extend(Prototype)
  end
end
If I have the following code:
Prototype Inheritance
a = Object.new
def a.foo
  'foo'
end

b = Object.new
b.extend(Prototype).inherit_from(a)

c = b.create_child

p b.foo # => 'foo'
p c.foo # => 'foo'
So b.foo and c.foo both yield ‘foo’.
And I can override the parent implementation by defining a method with the same name:
Prototype Overrides
def a.bar
  'bar'
end

def c.bar
  'c.bar'
end

p a.bar # => 'bar'
p b.bar # => 'bar'
p c.bar # => 'c.bar'
So I added a new singleton method bar on a, b automatically inherits it, and I overrode bar on object c.
In conclusion, we are able to introduce prototype-based inheritance into Ruby by using Ruby’s powerful meta-programming mechanism. This implementation is only a proof of concept, so its performance is not very good. But we could try to improve the performance by consolidating the lookup into a dynamically defined method: the child object queries its parent the first time, and if the call succeeds it defines a method for that behavior, so method_missing is not called every time.
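The consolidation idea could look roughly like the sketch below. It is a hypothetical variant, not the module from this post: CachingPrototype is a made-up name, and it uses respond_to_missing? and define_singleton_method, which are assumptions about how one might implement the caching.

```ruby
module CachingPrototype
  def inherit_from(prototype)
    @prototype = prototype
    self
  end

  def respond_to_missing?(name, include_private = false)
    (@prototype && @prototype.respond_to?(name, include_private)) || super
  end

  def method_missing(name, *args, &block)
    return super unless @prototype && @prototype.respond_to?(name)

    prototype = @prototype
    # Consolidate the lookup: define a forwarding method on this object so
    # later calls no longer go through method_missing.
    define_singleton_method(name) do |*a, &b|
      prototype.public_send(name, *a, &b)
    end
    public_send(name, *args, &block)
  end
end

a = Object.new
def a.foo
  'foo'
end

b = Object.new.extend(CachingPrototype).inherit_from(a)

cached_before = b.singleton_methods.include?(:foo)
first_result  = b.foo
cached_after  = b.singleton_methods.include?(:foo)
second_result = b.foo

p [cached_before, first_result, cached_after, second_result]
```

One trade-off of this sketch is that the forwarding method captures the prototype at the time of the first call, so re-parenting the object later with inherit_from would not update methods that are already cached.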
I just realized that I have misunderstood Ruby’s “class methods” for quite a long time!
Here is a piece of code:
Instance Methods Inheritance
class A
  def foo
    'foo'
  end
end

class B < A
end

a = A.new
b = B.new

a.foo.should == 'foo'
b.foo.should == 'foo'
The previous piece of code demonstrates the typical inheritance mechanism in almost every class-based OO language. (There are a few exceptions that use prototype inheritance, such as JavaScript, although whether JavaScript is an OO language is still something of a mystery XD.) In most common OO languages, this is all that inheritance is about! But in Ruby, things are not that simple, thanks to Ruby’s eigen-class (also known as the singleton class or metaclass).
In Ruby, I just found out that a derived class inherits not only the instance methods but also the class methods! It came as a bit of a surprise to me!
Class Methods Inheritance
class A
  def self.bar
    'bar'
  end
end

class B < A
end

A.bar.should == 'bar'
B.bar.should == 'bar'
Most people who know Java, C# or even C++ won’t be surprised by A.bar.should == 'bar', but you might be surprised by B.bar.should == 'bar', as I was.
To me, bar is declared on class A and B inherits from A, and then I can call the method declared on class A on class B! It is amazing!
This works because in Ruby a “class method” is actually an instance method of the eigen-class of the class, and def self.foo is just syntactic sugar. So we can rewrite the code as:
Rewritten Class Methods Inheritance
class A
end

class B < A
end

class << A
  def bar
    'bar'
  end
end

A.bar.should == 'bar'
B.bar.should == 'bar'
If we call A’s eigen-class AA and B’s eigen-class BB, then we find that BB.superclass == AA:
BB and AA
class A; end
class B < A; end

AA = class << A; self; end
BB = class << B; self; end

B.superclass.should == A
BB.superclass.should == AA
We also know that A is actually an instance of AA and B is an instance of BB, so obviously we can call the instance methods defined on AA on B. That’s the reason why class methods in Ruby can be inherited!
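The same relationships can be checked without naming the eigen-classes by hand. This is a small sketch using Object#singleton_class and Method#owner, which are standard Ruby but are not used in the original post.

```ruby
class A
  def self.bar
    'bar'
  end
end

class B < A
end

aa = A.singleton_class
bb = B.singleton_class

# The eigen-classes mirror the class hierarchy.
mirrors_hierarchy = bb.superclass == aa

# bar is found through B, but it lives in A's eigen-class.
bar_owner_is_aa = B.method(:bar).owner == aa

# B defines no class methods of its own.
b_own_class_methods = B.singleton_methods(false)

p mirrors_hierarchy, bar_owner_is_aa, b_own_class_methods, B.bar
```

So calling B.bar is an ordinary method lookup: Ruby starts at B’s eigen-class, finds nothing there, and continues to its superclass, which is A’s eigen-class.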
But there is some inconsistency in Ruby: AA is the superclass of BB, yet on the Ruby version I was using you won’t find AA in BB’s ancestors! In fact, BB.ancestors yields something like [Class, Module, Object, Kernel, BasicObject] if no module has been injected into Class, Module or Object. (Newer Ruby versions do list the eigen-classes in BB.ancestors.)
This design is weird to me and kind of hard to understand, so for quite a long time I didn’t even know that class methods in Ruby can be inherited! I drew a graph to show the relationships between the classes. In the graph I use <class:A> to indicate the eigen-class of A. A line with an empty triangle represents inheritance, and an arrow line represents instantiation. The graph is not complete: I omitted some unimportant classes, and I use dotted lines to indicate that something is missing on a line.
Today we met a weird problem when returning a string via JSON.
Here is the CoffeeScript code:
Front End
$.post serverUrl, data, (status) ->
  console.log status
And here is our controller:
Backend Action
def action
  # do some complex logic
  render json: "success"
end
The code looks perfect, but we found that the callback is never called! If you check the network traffic, you will find that the server does send its response “success”, but the callback is still not called!
After spending half an hour struggling with jQuery, we finally found the problem!
The reason is that success is not valid JSON! A valid JSON string must be wrapped in double quotes, otherwise the JSON parser treats it as a bare token, like true, false or null, and fails to parse it.
So to fix the problem, we need to change our action code:
Fixed Backend Action
def action
  # do some complex logic
  render json: '"success"'
end
This is really a pitfall, since the wrong code looks so natural!
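The difference between the two payloads is easy to check with Ruby’s own json library. This is only a sketch outside of Rails: valid_json? is a made-up helper, and it assumes a json version that accepts a bare string as a top-level document (json 2.x, which ships with current Ruby versions).

```ruby
require 'json'

# Returns true when `text` parses as a JSON document.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

bare    = valid_json?('success')     # what the first action sent
quoted  = valid_json?('"success"')   # what the fixed action sends
encoded = 'success'.to_json          # lets the library add the quotes

p bare, quoted, encoded
```

Calling to_json on the string (or rendering a hash such as { status: 'success' }) avoids writing the nested quotes by hand, which may be less error prone than the literal '"success"'.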
Thanks to Ruby’s powerful meta-programming capability and Rails’ delegate syntax, we can easily write a graceful singleton class that makes the class work like an instance.
In a traditional language such as C#, we usually write singleton code like this:
Singleton in C#
class Foo
{
    // Singleton Declaration
    // (not readonly, because the field is assigned lazily in the getter)
    private static Foo instance;

    public static Foo Instance
    {
        get
        {
            if (instance == null)
            {
                instance = new Foo();
            }
            return instance;
        }
    }

    // Define instance behaviors
    // ...
}
The previous approach works fine, but the code that uses Foo is kind of ugly. Every time we want to invoke the method Bar on Foo, we need to write Foo.Instance.Bar() rather than the more graceful Foo.Bar(). To solve this problem we need to implement the class in this way:
Class Delegation in C#
class Foo
{
    // Singleton Declaration
    // ...

    // Define instance behaviors
    // Note: C# does not allow an instance member and a static member with the
    // same signature in one class, so real code must name them differently.
    public void Bar()
    {
        // Bar behaviors
        // ...
    }

    public static void Bar()
    {
        Instance.Bar();
    }

    public string Baz
    {
        get { /* Getter behavior */ }
        set { /* Setter behavior */ }
    }

    public static string Baz
    {
        get { return Instance.Baz; }
        set { Instance.Baz = value; }
    }
}
This approach simplifies the calling code but complicates the declaration. You can use a trick such as code snippets, or a code generation technology such as Text Templates or CodeSmith, to generate the dull delegation code. But it is still not graceful at all.
If we write the same code in Ruby, things become much easier, thanks to Ruby’s powerful meta-programming capability.
So in the Ruby solution we use just one statement, delegate *Foo::Base.instance_methods, :to => :instance, to delegate all the methods defined in Foo::Base to the instance.
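The Rails version of this code is not shown here, so the following is a stand-in sketch rather than the original. To keep it runnable without Rails, it swaps the delegate macro for SingleForwardable from Ruby’s standard library and uses the stdlib Singleton module to provide Foo.instance; the methods bar and baz are placeholders.

```ruby
require 'forwardable'
require 'singleton'

class Foo
  # Instance behaviors live in a module so they are easy to enumerate.
  module Base
    attr_accessor :baz

    def bar
      'bar'
    end
  end

  include Singleton          # provides Foo.instance
  include Base
  extend SingleForwardable   # provides def_delegators on Foo itself

  # One statement forwards every method of Base from the class to the instance.
  def_delegators :instance, *Base.instance_methods
end

Foo.baz = 'hello'
p Foo.bar
p Foo.baz
p Foo.instance.baz
```

The caller can now write Foo.bar instead of Foo.instance.bar, and the state still lives on the single instance.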
Besides this solution, there is also another, cheaper solution that works:
Singleton in Ruby
# foo.rb
class Foo
  autoload :Base, 'foo/base'

  include Base
  extend Base
end

# foo/base.rb
module Foo::Base
  # Define instance behaviors
  # ...
end
The two approaches make the code behave slightly differently, but both work.
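One difference worth knowing about shows up as soon as the shared methods keep state. In the sketch below (counter and increment are placeholder methods, and the module is nested inline instead of autoloaded), the class and each instance get their own copy of the instance variable, because with extend the class itself is the receiver.

```ruby
class Foo
  module Base
    def counter
      @counter ||= 0
    end

    def increment
      @counter = counter + 1
    end
  end

  include Base   # instances get the methods
  extend Base    # the class itself gets the same methods
end

Foo.increment
Foo.increment

instance = Foo.new
instance.increment

p Foo.counter
p instance.counter
```

With the delegation approach all calls end up on one shared instance, while here Foo and every Foo.new carry independent state, which may or may not be what you want.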