Reading the Ruby 3.4 NEWS with professionals (English translation)

Published at
12/27/2024
Author
Koichi Sasada

This article is a Japanese-to-English translation of プロと読み解くRuby 3.4 NEWS - STORES Product Blog, produced mainly with DeepL.

This is Koichi Sasada (ko1) and Yusuke Endoh (mame) from the Technology Infrastructure Group in the Technology Division at STORES, Inc. We are developing the Ruby interpreter (MRI: Matz Ruby Implementation, or the ruby command). We are professional Ruby committers because we are paid to develop Ruby.

Ruby 3.4.0 was released today, December 25th, as the annual Christmas release (Ruby 3.4.0 Release). This year, we will also be explaining the NEWS.md file for Ruby 3.4 on the STORES Product Blog (incidentally, this will be an article for the STORES Advent Calendar 2024, written in Japanese). Please see the previous article for an explanation of what a NEWS file is.

This article not only explains the new features, but also describes the background to the changes and the difficulties involved, to the best of our recollection.

Ruby 3.4 saw almost no changes to the language (grammar). This is because it was decided to hold off on grammar changes in order to focus on switching the parser that interprets the grammar to Prism.

The main changes in Ruby 3.4 are as follows (excerpt from the release notes):

  • it is added
  • The default parser is changed to Prism
  • The Socket library is updated to support Happy Eyeballs Version 2 (RFC 8305)
  • YJIT is updated
  • Modular GC

In this article, we will introduce the items in the NEWS file, including these.

Language Changes

Block parameter it is available

it, which refers to the block parameter without requiring you to name it, has finally been introduced.

ary = ["foo", "bar", "baz"]

p ary.map { it.upcase }  #=> ["FOO", "BAR", "BAZ"]

it makes simple blocks easier to write and easier to read.

Difference from numbered parameter

it is roughly the same as _1 of numbered parameters. However, the restrictions on where it can be written are a little more relaxed.

"foo".then do
  p "bar".then { _1.upcase } #=> "BAR"
  p # A syntax error if you write _1 here
end

"foo".then do
  p "bar".then { it.upcase } #=> "BAR"
  p it.upcase #=> "FOO" # this is allowed with `it`
end

Even though it is allowed, it is better to avoid using it in anything other than one-line blocks, as it makes it difficult to understand what it refers to.

Also, while numbered parameters allow you to access the n-th argument with _2 or _3, it does not have such a function.

By the way, it is not allowed to refer to it and numbered parameters in the same block.

[1, 2, 3].then { p [it, _1, _2] } # This is a syntax error

Background to the introduction of it

When numbered parameters were introduced, there were a number of proposals for how to write them, and it was one of them [Feature #15897]. Or rather, I was the one who proposed it.

Quite a few people liked the name it, but there were compatibility issues. The method name it is used a lot in existing code (mainly RSpec), so simply making it a keyword would cause a fatal incompatibility. To avoid this, we came up with the trick of treating it as a keyword only when it is called with no arguments.

However, it does not have a function to read the n-th argument, and because we were not sure whether that was okay, the notation _1 was eventually adopted.

After that, as people started to use numbered parameters in practice, it became clear that the function to read the second and subsequent arguments was not used very much, and that _1 was overwhelmingly more commonly used.

_1, and especially its numeric part "1", turned out to be cognitively taxing to read. Therefore, it was decided to introduce it as a more visually friendly alternative to _1. To ease the transition, Ruby 3.3 warned about calls to a method named it with no arguments, and it is now available in Ruby 3.4.

(mame)

frozen_string_literal will be default and a migration path is added

  • String literals in files without a frozen_string_literal comment now emit a deprecation warning when they are mutated. These warnings can be enabled with -W:deprecated or by setting Warning[:deprecated] = true. To disable this change, you can run Ruby with the --disable-frozen-string-literal command line argument. [Feature #20205]

There has long been a feature called frozen_string_literal that freezes string literals, and it is going to become the default. However, this does not happen in Ruby 3.4 itself but in a later version (so Ruby 3.4 is only a step in that direction).

As a concrete change in Ruby 3.4, when you run with the -w or -W:deprecated option (that is, when you enable deprecation warnings), you will now get warnings about changes (destructive operations) to string literals.

$ ruby -W:deprecated -e '"" << ?a'
-e:1: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information)

As the warning message says, if you add --debug-frozen-string-literal, it will display where the string that was subject to the destructive operation was created.

$ ruby -W:deprecated --debug-frozen-string-literal -e 's = ""
s << ?a'
-e:2: warning: literal string will be frozen in the future
  # warning because the destructive operation is performed on the second line
-e:1: info: the string was created here
  # displays that it is for the string created on the first line

As before, the frozen_string_literal pragma is available.

$ ruby -W:deprecated -e '# frozen_string_literal: true
"" << ?a'
-e:2:in '<main>': can't modify frozen String: "", created at -e:2 (FrozenError)

And if you specify # frozen_string_literal: false, which has been the default behavior until now and which probably no one ever wrote explicitly, this warning will no longer be displayed. In other words, it declares that string literals in the file are not frozen.

$ ruby -W:deprecated -e '# frozen_string_literal: false
"" << ?a'

If you want to specify this behavior for the entire application, specify --enable-frozen-string-literal or --disable-frozen-string-literal.

$ ruby -W:deprecated --enable-frozen-string-literal -e 'a = ""
a << ?a'
-e:2:in '<main>': can't modify frozen String: "" (FrozenError)

$ ruby -W:deprecated --disable-frozen-string-literal -e 'a = ""
a << ?a'
# No warnings

In other words, if you specify --disable-frozen-string-literal, the warning will not be displayed in the same way as before.

As the warning “warning: literal string will be frozen in the future” indicates, destructive operations on string literals will raise an error in the future. According to the ticket, this is planned to become an error from Ruby 4.0. Well, it's just a plan, so we don't know what will happen.

Internally, strings that issue a warning when a destructive operation is performed (and will become frozen in the future) are called "Chilled Strings", but this is knowledge that is not necessary if you are using Ruby normally.

Background to this story:

The idea of freezing string literals by default was planned for Ruby 3.0, and there was a lot of discussion about it in the Ruby 2.x era.

There are thought to be two main reasons for wanting this: (1) performance benefits, because a frozen string literal can return the same object every time it is evaluated, and (2) programs are easier to reason about when as much as possible is frozen. The reason you hear most often is (1).

However, it was not introduced at that time because the disadvantages, such as incompatibility, were judged to be greater than the benefits. Even so, many people like this feature, and many write # frozen_string_literal: true at the beginning of their scripts. It is also often inserted automatically with editor support.

Therefore, if everyone is writing it in the first place, why not make this behavior the default and not have to write this troublesome pragma? This is the reason why we are trying to make it the default again this time.

By the way, according to the comments on Feature #20205: Enable frozen_string_literal by default - Ruby master - Ruby Issue Tracking System, the following percentage of the files in the source code of published RubyGems specify # frozen-string-literal: true.

Number of gems that have `# frozen-string-literal: true` in all .rb files

  among all public gems: 14254 / 175170 (8.14%)
  per-file basis: 445132 / 3051922 (14.59%)

  among public gems with the latest version released in 2015-present: 14208 / 101904 (13.94%)
  per-file basis: 443460 / 1952225 (22.72%)

  among public gems with the latest version released in 2020-present: 11974 / 41205 (29.06%)
  per-file basis: 389559 / 1136445 (34.28%)

  among public gems with the latest version released in 2022-present: 9121 / 26721 (34.13%)
  per-file basis: 329742 / 848061 (38.88%)

About 15% overall. It seems that it is included in about 35% of the files that have been released recently (since 2022). How about it? Was it a lot? Was it a little?

Up until now, it was easy to prepare a string (buffer) that could be changed casually like s = ""; s << s2, but now it requires a little more effort, such as s = "".dup; s << s2 or s = +""; s << s2, so I think it will be a fairly big change in Ruby's mindset.

But I guess it's not that big a deal for people who have written # frozen_string_literal: true?
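As a rough illustration of the idioms mentioned above, here is a small sketch that builds a buffer in ways that should keep working once string literals are frozen. The helper name build_csv_line is made up for this example. Note that String.new returns a string with binary (ASCII-8BIT) encoding, unlike +"" and "".dup.

```ruby
# Hypothetical helper: joins fields into one comma-separated line
# using a mutable buffer.
def build_csv_line(fields)
  buf = +""   # String#+@ returns a string that is safe to mutate
  fields.each_with_index do |field, i|
    buf << "," unless i.zero?
    buf << field.to_s
  end
  buf
end

line = build_csv_line(["a", 1, :b])
p line          #=> "a,1,b"
p line.frozen?  #=> false

# Other ways to obtain a mutable empty string
p "".dup.frozen?      #=> false
p String.new.frozen?  #=> false
```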

By the way, --enable-frozen-string-literal was introduced when we were discussing Ruby 2.x in order to test what would happen if string literals were frozen by default (I wrote a patch a long time ago). The --disable-frozen-string-literal option, which was introduced at the same time, has finally seen the light of day.

(ko1)

Incidentally, I really hate this change. Ruby objects should be mutable in principle.

(mame)

The conditions for whether String#+@ dups have changed

  • String#+@ now duplicates when mutating the string would emit a deprecation warning, offered as a replacement for the str.dup if str.frozen? pattern.

This follows from the move toward making frozen_string_literal the default.

Until now, String#+@, i.e. +"foo", did nothing if the string was mutable. It now duplicates the string if it is mutable but would warn on destructive operations (that is, if it is a chilled string).

a = "str"
b = +a
p a.equal?(b) #=> false, i.e. it is a different object

c = "str".dup # No warning is given if you make a destructive change to c
d = +c
p c.equal?(d) #=> true, i.e. the same object without dup

In other words, a used to behave the same way as c (no duplication), and only the behavior for a has changed.

So if you see the warning shown earlier, adding + is now a valid way to fix it.

(ko1)

Keyword arguments ** can now be passed nil

  • Keyword splatting nil when calling methods is now supported. **nil is treated similarly to **{}, passing no keywords, and not calling any conversion methods. [Bug #20064]

nil can now be passed to **, the operator that passes a hash as keyword arguments.

def hello(name: "default")
  puts "Hello, #{ name }"
end

# Pass a hash as a keyword argument
h = { name: "mame" }
hello(**h) #=> Hello, mame

# If you pass nil instead of a hash, it is the same as specifying an empty hash
h = nil
hello(**h) #=> Hello, default

This was allowed because Matz was persuaded by the following code example.

h = if params.key?(:name)
  { name: params[:name] }
end

User.new(**h)

Personally, I wouldn't recommend using the nil implicitly returned by if.
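For comparison, here is a minimal sketch of how the same call can be written without relying on the implicit nil. The helper name_option is hypothetical, and this version of hello returns the string instead of printing it so that the result is easy to inspect. This style also works on Ruby versions before 3.4.

```ruby
def hello(name: "default")
  "Hello, #{name}"
end

# Hypothetical helper: always returns a hash, so ** never receives nil.
def name_option(params)
  params.key?(:name) ? { name: params[:name] } : {}
end

p hello(**name_option({ name: "mame" }))  #=> "Hello, mame"
p hello(**name_option({}))                #=> "Hello, default"
```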

(mame)

Block passing is no longer allowed in index assignment

  • Block passing is no longer allowed in index assignment (e.g. a[0, &b] = 1). [Bug #19918]

I don't think anyone would do this, but it was previously possible to pass a block, as in a[0, &b] = 1.
Supporting this properly would have meant considering many things, such as the order of evaluation, so it is now prohibited.

(ko1)

Keyword arguments are no longer allowed in index assignment (e.g. a[0, kw: 1] = 2).

  • Keyword arguments are no longer allowed in index assignment (e.g. a[0, kw: 1] = 2). [Bug #20218]

Similarly, writing a[0, kw: 1] = 2 is now prohibited.
It seems that some people were actually using it, but it has been prohibited anyway.
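If you really do need a keyword argument or a block for []= (this case and the block-passing case above), one option is to call []= as an ordinary method, which is not index assignment syntax. The following is only a sketch; the Table class and its upcase: keyword are invented for illustration.

```ruby
# Hypothetical container whose []= takes an optional keyword and block.
class Table
  def initialize
    @cells = {}
  end

  def []=(index, value, upcase: false, &transform)
    value = value.upcase if upcase
    value = transform.call(value) if transform
    @cells[index] = value
  end

  def [](index)
    @cells[index]
  end
end

t = Table.new
t[0] = "plain"                   # ordinary index assignment still works
t.[]=(1, "loud", upcase: true)   # keyword passed through an explicit method call
t.[]=(2, "abc", &:reverse)       # block passed through an explicit method call

p t[0] #=> "plain"
p t[1] #=> "LOUD"
p t[2] #=> "cba"
```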

(ko1)

The toplevel constant Ruby is reserved

  • The toplevel name ::Ruby is reserved now, and the definition will be warned when Warning[:deprecated]. [Feature #20884]

The toplevel constant Ruby has been reserved. Defining it will result in a warning.

# Run with `ruby -w`

# A warning is given when you define a Ruby module yourself
module Ruby #=> warning: ::Ruby is reserved for Ruby 3.5
end

# A warning is also given when you assign to the Ruby constant
Ruby = 1 #=> warning: ::Ruby is reserved for Ruby 3.5

In Ruby 3.5, the Ruby module will be defined, and it is planned to contain meta-functions related to Ruby that should be provided commonly by Ruby implementations. What will be placed in the module has not yet been decided, but it is likely that Ruby::VERSION will be defined as an alias for RUBY_VERSION.

(mame)

Update of built-in classes

Array#fetch_values was added

Since Hash#fetch_values exists, Array#fetch_values is also added.

["foo", "bar", "baz"].fetch_values(1, 2)         #=> ["bar", "baz"]
["foo", "bar", "baz"].fetch_values(1, 4)         #=> index 4 outside of array bounds: -3...3 (IndexError)
["foo", "bar", "baz"].fetch_values(1, 4) { 42 }  #=> ["bar", 42]

It seems that Array#values_at returns nil for out-of-bounds access, whereas Array#fetch_values raises an IndexError (or calls the block if given).
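The difference can be seen side by side in the following sketch. Since Array#fetch_values only exists in Ruby 3.4 and later, that part is guarded with respond_to? so the snippet also runs on older versions.

```ruby
ary = ["foo", "bar", "baz"]

# values_at silently fills out-of-range positions with nil
p ary.values_at(1, 4)  #=> ["bar", nil]

if ary.respond_to?(:fetch_values) # Ruby 3.4 or later
  begin
    ary.fetch_values(1, 4)
  rescue IndexError => e
    puts "IndexError: #{e.message}"
  end

  # The block receives the missing index and supplies a fallback value
  fallback = ary.fetch_values(1, 4) { |i| "missing #{i}" }
  p fallback  #=> ["bar", "missing 4"]
end
```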

(mame)

Exception#set_backtrace now accepts arrays of Thread::Backtrace::Location

  • Exception#set_backtrace now accepts arrays of Thread::Backtrace::Location. Kernel#raise, Thread#raise and Fiber#raise also accept this new format. [Feature #13557]

The background explanation is very long. This is not a feature that we want to be used very much, so please only read it if you really want to.

First of all, Ruby exception backtraces can be programmatically retrieved using the Exception#backtrace method.

def foo
  raise RuntimeError, "exception"
end

begin
  foo
rescue
  pp $!.backtrace #=> ["test.rb:2:in 'Object#foo'", "test.rb:6:in '<main>'"]
end

However, if you wanted to extract just the file name from each line of the backtrace, you had to use regular expressions to extract it. That's why the Exception#backtrace_locations method was introduced. This returns an array of Thread::Backtrace::Location instances, so you can easily extract the file name, line number, etc.

def foo
  raise RuntimeError, "exception"
end

begin
  foo
rescue
  loc = $!.backtrace_locations.first

  # Looks like a string, but is actually an instance of Thread::Backtrace::Location
  p loc        #=> "test.rb:2:in 'Object#foo'"
  p loc.class  #=> Thread::Backtrace::Location

  # You can easily get the file name and line number
  p loc.path   #=> "test.rb"
  p loc.lineno #=> 2
end

You can also get the backtrace at the time of the call using Kernel#caller_locations.

def foo
  loc = caller_locations.first
  p loc         #=> "test.rb:8:in '<main>'"
  p loc.path    #=> "test.rb"
  p loc.lineno  #=> 8
end

foo

By the way, you can replace the exception backtrace with the third argument of raise.

def foo
  raise RuntimeError, "exception", ["foo", "bar"]
end

begin
  foo
rescue
  pp $!.backtrace #=> ["foo", "bar"]
  pp $!.backtrace_locations #=> nil
end

When the backtrace is specified as an array of strings like this, the interpreter cannot know the actual file name or line number, so it cannot create Thread::Backtrace::Location. Therefore, Exception#backtrace_locations returns nil.

Now, what happens if you specify caller_locations instead of an array of strings as the third argument to raise?

# Ruby 3.3
def foo
  raise RuntimeError, "exception", caller_locations
    #=> in `set_backtrace': backtrace must be Array of String (TypeError)
end

foo

This raises an error because there is a validation in Exception#set_backtrace that backtrace must be an array of strings.

Here's the change we're making. Exception#set_backtrace now accepts an array of Thread::Backtrace::Location.

# Ruby 3.4
def foo
  raise RuntimeError, "exception", caller_locations
end

begin
  foo
rescue
  pp $!.backtrace_locations  #=> ["test.rb:6:in '<main>'"]
end

It seems that you can use this when you want to cut the backtrace or copy it from another exception. However, as I explained at length, if you try to fake the backtrace, it will make debugging more difficult, so I think it's better not to do it.

(mame)

A mechanism for offloading time-consuming operations using the Fiber Scheduler has been introduced

  • An optional Fiber::Scheduler#blocking_operation_wait hook allows blocking operations to be moved out of the event loop in order to reduce latency and improve multi-core processor utilization. [Feature #20876]

The Fiber scheduler can now accept a blocking_operation_wait callback. This callback is provided to allow the Fiber scheduler to create a separate thread to offload time-consuming operations that attempt to release the GVL, such as zlib processing.

In practice, I think it would be difficult to use, so I think it would be better to leave it to the framework.

(ko1)

IO::Buffer#copy now releases the GVL

  • IO::Buffer#copy can release the GVL, allowing other threads to run while copying data. [Feature #20902]

When executing IO::Buffer#copy, the GVL is now released, allowing other threads to run. This can also be handled by the blocking_operation_wait hook explained above.

(ko1)

GC.config added to allow accessing configurations on the Garbage Collector & new control over generational GC

  • GC.config added to allow setting configuration variables on the Garbage Collector. [Feature #20443]
  • GC configuration parameter rgengc_allow_full_mark introduced. When false GC will only mark young objects. Default is true. [Feature #20443]

GC has various settings, and the GC.config method has been added to access those settings.

$ ruby -e 'pp GC.config'
{rgengc_allow_full_mark: true, implementation: "default"}

You can change the settings like this.

$ ruby -e 'p GC.config(rgengc_allow_full_mark: false)'
{rgengc_allow_full_mark: false}

The changed settings are returned. Hmm, I wonder if it's okay not to return the values before the change...

Currently, there are two settings: rgengc_allow_full_mark and implementation. I expect the number of settings to keep growing in the future.

The implementation setting holds the name of the GC implementation (it is, of course, read-only). As we will introduce later in this article, Ruby is being changed so that the GC implementation can be replaced. The example value is “default”, so we can see that it is the same GC implementation as before.

The rgengc_allow_full_mark setting determines whether or not to allow time-consuming major GC to target objects in older generations in generational GC. The default is true.

If you set this to false, major GC will not occur even if there are many objects in the old generation (more precisely, the number of old objects does not increase while the setting is false). This was introduced to create a modern version of the old OOB (out-of-band) GC. In other words, it is now possible to prohibit time-consuming major GC while requests are being processed, and to perform major GC when there is time to spare. I don't know much about it, but I think it will be used in Rails (this feature was added based on a discussion about making OOB GC the default in Ruby on Rails: Run GC out-of-band by default · Issue #50449 · rails/rails).

Let's try it out a bit.

$ ruby -e 'GC.config(rgengc_allow_full_mark: false)
           pp GC.stat;
           a = []
           10_000_000.times{ a << [] }
           pp GC.stat'
{count: 4,
 minor_gc_count: 2,
 major_gc_count: 2,
 old_objects: 12017,
 (omitted)
}
{count: 17,
 minor_gc_count: 16,
 major_gc_count: 2,
 old_objects: 12017,
 (omitted)
}

major_gc_count has not changed; that is, we can see that no major GC occurred. Also, since the number of old_objects has not changed, we can see that the objects did not become old-generation objects in the first place.

If you check GC.latest_gc_info(:need_major_by), you can see whether a major GC is needed now (if it's truthy, the GC wants to do a major GC).

It's a difficult feature to handle, so it's probably best to leave it to the framework.
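To give a feel for what a framework might do with this, here is a minimal sketch of the "suppress major GC during work, catch up afterwards" pattern. The helper with_deferred_major_gc is hypothetical and not taken from Rails or any other library. It reads :need_major_by through the hash returned by GC.latest_gc_info, so a missing key simply yields nil, and it falls back to just running the block where GC.config is unavailable.

```ruby
# Hypothetical helper: runs the block with major GC suppressed, then
# performs a deferred major GC afterwards if one was requested.
def with_deferred_major_gc
  return yield unless GC.respond_to?(:config) # GC.config is Ruby 3.4 or later

  GC.config(rgengc_allow_full_mark: false)
  begin
    yield
  ensure
    GC.config(rgengc_allow_full_mark: true)
    # Truthy when a major GC was wanted but not allowed to run.
    GC.start if GC.latest_gc_info[:need_major_by]
  end
end

# Stand-in for "processing a request"
result = with_deferred_major_gc do
  Array.new(100_000) { [] }.size
end
p result #=> 100000
```

Restoring the setting in ensure keeps the process from being stuck with major GC disabled if the block raises.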

By the way, GC.config itself was introduced as a general-purpose method during the discussion of how to expose the rgengc_allow_full_mark setting, on the assumption that more settings would be added.

(ko1)

It is now possible to specify the capacity required when generating a hash

  • Hash.new now accepts an optional capacity: argument, to preallocate the hash with a given capacity. This can improve performance when building large hashes incrementally by saving on reallocation and rehashing of keys. [Feature #19236]

h = Hash.new
10_000_000.times{ h[it] = it }

When creating a hash with 10 million elements, this program allocates memory during iteration, but if you know that you're going to create 10 million elements from the start, it seems more efficient to allocate the memory all at once at the start.

So, we've changed it so that Hash.new(capacity: n) allocates memory for n elements at the start. This is likely to be faster than allocating memory a little at a time. Let's try it.

$ time ruby -e 'n = 10_000_000; h = Hash.new; n.times{ h[it] = it }'

real    0m2.842s
user    0m2.485s
sys     0m0.294s

$ time ruby -e 'n = 10_000_000; h = Hash.new(capacity: n); n.times{ h[it] = it }'

real    0m2.122s
user    0m1.996s
sys     0m0.072s

capacity: makes it a little faster.

To be honest, I don't think there are many cases where this would be effective, but if you find one, please give it a try.

(ko1)

Integer exponentiation no longer returns Float::INFINITY

  • Integer#** used to return Float::INFINITY when the return value is large, but now returns an Integer. If the return value is extremely large, it raises an exception. [Feature #20811]

  • Rational#** used to return Float::INFINITY or Float::NAN when the numerator of the return value is large, but now returns a Rational. If it is extremely large, it raises an exception. [Feature #20811]

When the result of an integer power of an integer is a very large number, Ruby 3.3 and earlier would return Float::INFINITY, but from Ruby 3.4, it will now calculate the result directly, even if it takes a long time.

# Ruby 3.3: Returns Float::INFINITY while also issuing a warning "warning: in a**b, b may be too big"
p 10**10000000 #=> Infinity

# Ruby 3.4: Calculates as is
p 10**10000000 #=> 10000000000000000000000000000.....

The trigger for this change was the news in October this year that a new largest known prime number had been discovered. I (Endoh) tried to calculate it in Ruby by executing 2**136279841-1, but it returned Float::INFINITY. A disappointing experience. In fact, I had the same disappointing experience six years ago when the previous largest known prime number was discovered, and I thought I would be disappointed again at the next discovery if nothing was done, so I proposed the change.

The original behavior was probably meant to avoid time-consuming calculations. However, it is hard to imagine a situation where getting INFINITY back would make anyone happy, and thanks to GMP, even calculations on this scale are no longer that slow (on my laptop, it takes about 1 second to calculate 2**136279841-1, and about 10 seconds for the radix conversion to display it). So it was decided that the calculation could simply be carried out as is.

Also, Rational integer powers have changed in the same way.

(mame)

A byte-based version of MatchData#begin has been introduced

  • MatchData#bytebegin and MatchData#byteend have been added. [Feature #20576]

MatchData#bytebegin has been introduced as a byte-based version of MatchData#begin. The same applies to #byteend.

"あいう" =~ /い/

# “い” spans character offsets 1...2 (it is the second character)
p $~.begin(0) #=> 1
p $~.end(0)   #=> 2

# “い” is at bytes 3...6
p $~.bytebegin(0) #=> 3
p $~.byteend(0)   #=> 6

(mame)

Object#singleton_method now sees modules extended

  • Object#singleton_method now returns methods in modules prepended to or included in the receiver's singleton class. [Bug #20620]

o = Object.new
o.extend(Module.new{def a = 1})
o.singleton_method(:a).call #=> 1
# Previously, this raised an error: singleton_method': undefined singleton method `a'

It seems that Object#singleton_method did not return the method of the extended module before, but it now works properly.

Ractor can now require

  • require in Ractor is allowed. The requiring process will be run on the main Ractor. Ractor._require(feature) is added to run requiring process on the main Ractor. [Feature #20627]

Until now, it was not possible to call require in a Ractor other than the main one. This is because:

  1. it is unclear how to manage state such as $LOADED_FEATURES
  2. almost all libraries are expected to be executed on the main Ractor

For toy-like applications, it's fine to require all libraries before using Ractor, but as the scale increases, things like require-ing in methods (e.g. the pp library is required when the pp method is first executed) and the existence of autoload become major problems.

So, we changed the process so that require is done in the main Ractor when require is called in another Ractor.

However, there are various libraries in the world that redefine require. In such cases, you will need to determine whether you are in the main Ractor yourself, and if not, use the Ractor._require(feature) method to execute require in the main Ractor. Well, I don't think any of our readers will be redefining require like that, so I don't think you need to remember this. Right?

Ractor.main? has been introduced as a convenient method for determining whether the currently running Ractor is the main Ractor.

When defining your own require, please use a combination of these methods, as follows.

  def require(feature)
    if !Ractor.main?
      Ractor._require(feature)
    else
      # do something with your original require process
    end
  end

Incidentally, redefinitions of Kernel#require are handled automatically: an anonymous module that behaves as described above is prepended to Kernel when Ractor.new is first called.

      Kernel.prepend Module.new{|m|
        m.set_temporary_name '<RactorRequire>'

        def require feature
          if Ractor.main?
            super
          else
            Ractor._require feature
          end
        end
      }

So, for example, I think that Kernel#require as defined by rubygems can be used without any changes, but if you want to extend it with prepend or define Object#require on your own, you will still need to use Ractor._require. Well, you wouldn't do that, right?

(ko1)

More ways to access the Ractor local storage

  • Ractor.[] and Ractor.[]= are added to access the ractor local storage of the current Ractor. [Feature #20715]

To access the storage area prepared for each Ractor (Ractor local storage), Ractor#[key] and Ractor#[key]= were provided. In fact, however, even if the receiver was another Ractor, they silently accessed the local storage of the current Ractor.

So it was decided that class methods would be sufficient, and Ractor.[] and Ractor.[]= were introduced. (It would be a problem if it were possible to access another Ractor's storage.) However, since the existing methods might be in use for error handling and so on, it seems that the current Ractor#[] will remain with its ambiguous specification.
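A minimal sketch of the class-method form, run in the main Ractor. The key name :request_count is just an example, and the code is guarded because Ractor.[] and Ractor.[]= require Ruby 3.4 or later.

```ruby
if Ractor.respond_to?(:[]=) # Ruby 3.4 or later
  Ractor[:request_count] = 0
  Ractor[:request_count] += 1

  p Ractor[:request_count]          #=> 1
  # The instance method on the current Ractor reads the same storage
  p Ractor.current[:request_count]  #=> 1
end
```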

  • Ractor.store_if_absent(key){ init } is added to initialize ractor local variables in a thread-safe manner. [Feature #20875]

Ractor local storage is a kind of global variable space (a global variable space that is different for each Ractor).

Now, when you want to set an initial value for each Ractor, the obvious code is not safe if multiple threads are running in the Ractor.

Ractor[:cnt] ||= 0
m = (Ractor[:m] ||= Mutex.new)
m.synchronize do
  Ractor[:cnt] += 1
end

At first glance, this program looks fine, but if you run this part in a multi-threaded environment, there is a possibility that different threads will get different Mutex objects for m. There is also a possibility that :cnt will be initialized in an unintended way.

Therefore, a method called Ractor.store_if_absent(key){ init } was introduced to perform the setting atomically. When the value corresponding to the key has not yet been initialized, the value obtained by executing the block is set as the value for that key. Since this initialization is performed atomically, it will not be disturbed by other threads.

The program mentioned above can be made thread-safe by doing the following.

m = Ractor.store_if_absent(:m){ Ractor[:cnt] = 0; Mutex.new }
m.synchronize do
  Ractor[:cnt] += 1
end
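To see the atomic initialization in action, here is a small sketch in which several threads race to initialize the same key. The key :demo_lock and the counter init_calls are made up for this example, and the code is guarded because Ractor.store_if_absent requires Ruby 3.4 or later.

```ruby
if Ractor.respond_to?(:store_if_absent) # Ruby 3.4 or later
  init_calls = 0

  threads = 8.times.map do
    Thread.new do
      Ractor.store_if_absent(:demo_lock) do
        init_calls += 1   # counts how many times the initializer block runs
        Mutex.new
      end
    end
  end

  # Every thread should receive the very same Mutex object
  locks = threads.map(&:value)
  p locks.uniq.size
  p init_calls
end
```

If the initialization is atomic as described, every thread gets the same Mutex and the block runs only once, so both numbers should be 1.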

As a concrete use, I have applied this to timeout.rb for now to make it Ractor-compatible.

(ko1)

Range#size now raises TypeError if the range is not iterable.

  • Range#size now raises TypeError if the range is not iterable. [Misc #18984]

Range#size now raises TypeError if the range is not iterable, such as (0.1 ... 1).

(0.1 ... 1).size  #=> 'Range#size': can't iterate from Float (TypeError)
# Until Ruby 3.3, it returned 1

The same applies to beginless ranges such as (...0).
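If existing code calls Range#size on ranges that may not be iterable, one possible way to cope is to rescue the TypeError. This is only a sketch, and size_or_nil is a hypothetical helper name.

```ruby
# Hypothetical helper: returns nil instead of raising for non-iterable ranges.
def size_or_nil(range)
  range.size
rescue TypeError
  nil # Ruby 3.4 or later: ranges such as (0.1...1) end up here
end

p size_or_nil(1..10)    #=> 10
p size_or_nil(0.1...1)  # nil on Ruby 3.4 or later (1 on Ruby 3.3)
```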

Range#step semantics have changed slightly

  • Range#step now consistently has a semantics of iterating by using + operator for all types, not only numerics. [Feature #18368]

In simple terms, Range#step can now be used with Time ranges.

pp (Time.new(2000, 1, 1)...).step(86400).take(3)
# => [2000-01-01 00:00:00 +0900,
# 2000-01-02 00:00:00 +0900,
# 2000-01-03 00:00:00 +0900]

Date and other ranges also work as expected.

In terms of actual behavior, Range#step repeatedly adds the step argument to the Range#begin value using the + method, and stops when it reaches Range#end. Until now, the behavior was implemented separately depending on the kinds of values in Range#begin and Range#end, so you could say the meaning has been clarified (however, there are still exceptions: the + method is not actually called for Integer ranges, and String ranges are handled specially).

Now it works without having to deal with Time, Date, etc. individually, but as a side effect, it behaves a little strangely with objects for which the “+” method doesn't seem to work like “addition”. I don't think you need to be careful about this, but just in case.

pp ([]...).step([1]).take(10)
[[],
[1],
[1, 1],
[1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1]]
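The "keep adding with + until the end is reached" semantics can be modeled roughly as follows. This is a simplified sketch for illustration, not Ruby's actual implementation: step_by_plus is a hypothetical helper, it assumes the step moves forward, and it ignores the Integer and String special cases mentioned above. It runs on older Ruby versions too, because it does not call Range#step.

```ruby
# Hypothetical model of the semantics: start at range.begin, add step with +,
# and stop once the value passes (or reaches an excluded) range.end.
def step_by_plus(range, step)
  Enumerator.new do |yielder|
    current = range.begin
    loop do
      unless range.end.nil? # endless ranges never stop on their own
        cmp = current <=> range.end
        break if cmp > 0 || (cmp == 0 && range.exclude_end?)
      end
      yielder << current
      current += step
    end
  end
end

start = Time.utc(2000, 1, 1)
p step_by_plus((start...), 86_400).take(3).map(&:day)  #=> [1, 2, 3]
p step_by_plus(([]...[1, 1, 1]), [1]).to_a             #=> [[], [1], [1, 1]]
```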

(mame)

RubyVM::AbstractSyntaxTree::Node#locations was introduced

  • Add RubyVM::AbstractSyntaxTree::Node#locations method which returns location objects associated with the AST node. [Feature #20624]
  • Add RubyVM::AbstractSyntaxTree::Location class which holds location information. [Feature #20624]

The #locations method, which returns location information in the code of RubyVM::AbstractSyntaxTree::Node, has been introduced.

# Get location information from the node obtained by parsing “1 + 1”.
loc = RubyVM::AbstractSyntaxTree.parse("1 + 1").locations.first
=> #<RubyVM::AbstractSyntaxTree::Location:@1:0-1:5>

# We can see that it extends from the 0th column of the 1st line to the 5th column of the 1st line
p loc.first_lineno #=> 1
p loc.first_column #=> 0
p loc.last_lineno  #=> 1
p loc.last_column  #=> 5

#locations returns one or more pieces of location information depending on the node. For example, the node for the code alias foo bar seems to return two: the location of the whole node and the location of the “alias” keyword. Prism nodes already carry this kind of location information, so this is groundwork for building a conversion from RubyVM::AbstractSyntaxTree::Node to Prism nodes.

(mame)

String#append_as_bytes was introduced

  • String#append_as_bytes was added to more easily and efficiently work with binary buffers and protocols. It directly concatenate the arguments into the string without any encoding validation or conversion. [Feature #20594]

The String#append_as_bytes method, which destructively concatenates strings, was introduced.

str = "".b
str.append_as_bytes(255, "ä")

# This will result in a byte string equivalent to "\xff" + "ä".
p str #=> "\xFF\xC3\xA4"

What is the difference between this and str << 255 << "ä"? In the following case, an exception is raised.

str = "".b
str << 255 << "ä"
  #=> incompatible character encodings: BINARY (ASCII-8BIT) and UTF-8 (Encoding::CompatibilityError)

You might think that str is a binary encoding, so it should be silently concatenated. However, that means implicitly converting the argument string "ä" into a broken string, which may not be a good idea. So it was decided to introduce a more explicit method with the name as_bytes.

Incidentally, String#append_as_bytes forcibly concatenates the receiver as a byte string, regardless of the encoding, not just binary encoding. I'm not sure about the use case for that, though.
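For code that has to run on both Ruby 3.4 and older versions, the behavior can be approximated as follows. This is a minimal sketch: append_bytes is a hypothetical helper, and the fallback branch is an approximation of String#append_as_bytes, not its actual implementation.

```ruby
# Hypothetical compatibility helper: uses String#append_as_bytes on Ruby 3.4+
# and approximates it on older versions.
def append_bytes(buffer, *parts)
  return buffer.append_as_bytes(*parts) if buffer.respond_to?(:append_as_bytes)

  original = buffer.encoding
  buffer.force_encoding(Encoding::BINARY)
  parts.each do |part|
    # Integers are appended as a single byte (only the low 8 bits are kept).
    buffer << (part.is_a?(Integer) ? [part & 0xFF].pack("C") : part.b)
  end
  buffer.force_encoding(original)
end

binary = "".b
append_bytes(binary, 255, "\u00E4") # "\u00E4" is "ä" in UTF-8
p binary.bytes                      #=> [255, 195, 164]

# The receiver keeps its encoding even if the result is no longer valid.
utf8 = String.new("abc", encoding: Encoding::UTF_8)
append_bytes(utf8, 0xFF)
p utf8.encoding                     #=> #<Encoding:UTF-8>
p utf8.valid_encoding?              #=> false
```

The second half shows the non-binary receiver case: the bytes are appended as-is, so a UTF-8 string can end up with invalid contents.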

(mame)

Symbol#to_s now emits a deprecation warning when mutated

  • The string returned by Symbol#to_s now emits a deprecation warning when mutated, and will be frozen in a future version of Ruby. These warnings can be enabled with -W:deprecated or by setting Warning[:deprecated] = true. [Feature #20350]
$ ruby -we ':sym.to_s << ?a'
-e:1: warning: warning: string returned by :sym.to_s will be frozen in the future

At the beginning of this article, we introduced the mechanism for making frozen-string-literal the default (a string that emits a warning when it is destructively modified, because it will be frozen in the future). The string returned by Symbol#to_s now uses the same mechanism, so if you try to make a destructive change to it, you will get a warning when deprecation warnings are enabled (for example, if you start Ruby with -w or -W:deprecated).

So I guess they want to freeze this return value in the future too. Symbol#name was introduced for exactly that purpose, so do we really have to do it for #to_s as well?
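If existing code mutates the result of Symbol#to_s, one way to prepare is to make the copy explicit. A minimal sketch (the variable names are only for illustration):

```ruby
sym = :status

# Symbol#name (Ruby 3.0+) returns a frozen string, so it cannot be mutated.
p sym.name.frozen? #=> true

# When a mutable string is really needed, copy it explicitly before mutating.
label = sym.to_s.dup
label << "_code"
p label            #=> "status_code"
```

The explicit dup keeps working whether or not the return value of Symbol#to_s is frozen in a future version.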

(ko1)

On Windows, the string encoding of Time#zone has changed slightly

  • On Windows, now Time#zone encodes the system timezone name in UTF-8 instead of the active code page, if it contains non-ASCII characters. [Bug #20929]

I don't really understand it, but it seems to have changed. The Windows native environment is difficult for me.

(mame)

Time#xmlschema has been incorporated into core

  • Time#xmlschema, and its Time#iso8601 alias have been moved into the core Time class while previously it was an extension provided by the time gem. [Feature #20707]

The Time#xmlschema and Time#iso8601 provided by the time gem have been incorporated into the main Ruby interpreter.

Time.new(2024, 12, 25).xmlschema #=> "2024-12-25T00:00:00+09:00"

The reason for this is that the C implementation is faster. That's right.
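A slightly fuller sketch, using UTC so that the output does not depend on the local time zone. The require line is only needed on Ruby 3.3 and earlier and is harmless on 3.4.

```ruby
require "time" # only needed on Ruby 3.3 and earlier

t = Time.utc(2024, 12, 25, 1, 2, 3)
p t.xmlschema  #=> "2024-12-25T01:02:03Z"
p t.iso8601(3) #=> "2024-12-25T01:02:03.000Z" (3 fractional digits)
```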

(mame)

Warning.categories has been added

  • Add Warning.categories method which returns a list of possible warning categories. [Feature #20293]

Recently introduced warnings have categories, and it is possible to turn the output on and off for each category, but since the number of categories is gradually increasing, a method to obtain a list of them has been introduced.

# Turn on warnings about deprecation
Warning[:deprecated] = true

# Returns a list of categories that can be specified.
Warning.categories
#=> [:deprecated, :experimental, :performance, :strict_unused_block]

The main use is for testing Ruby itself. When running certain tests, we want to suppress warnings, so we save the status of all categories, suppress them, and then restore them later. I don't think there are any other uses.
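The save, suppress, and restore pattern described above might look like the following. This is a sketch rather than the code used in Ruby's own tests: with_warnings_suppressed is a hypothetical helper, and the fallback category list for versions without Warning.categories is an assumption.

```ruby
# Hypothetical helper: turns every warning category off for the duration of
# the block and restores the previous settings afterwards.
def with_warnings_suppressed
  # Warning.categories exists on Ruby 3.4+; the fallback list is an assumption
  # for older versions.
  categories = Warning.respond_to?(:categories) ? Warning.categories : %i[deprecated experimental]
  saved = categories.to_h { |category| [category, Warning[category]] }
  categories.each { |category| Warning[category] = false }
  yield
ensure
  saved&.each { |category, enabled| Warning[category] = enabled }
end

Warning[:deprecated] = true
with_warnings_suppressed do
  p Warning[:deprecated] #=> false
end
p Warning[:deprecated]   #=> true
```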

(mame)

stdlib updates

There are various updates, but here we only cover the ones that have explanatory notes in the NEWS file.

RubyGems now supports sigstore.dev

  • Add --attestation option to gem push. It enabled to store signature of build artifact to sigstore.dev.

RubyGems now supports sigstore.dev, which aims to improve the security of the software supply chain.
Sigstore is a series of mechanisms that provide automated signing for the software supply chain. If you pass the file path to a Sigstore Bundle generated using cosign or sigstore-ruby to --attestation, you can upload the Gem signature to RubyGems.

(atpons)

Support for checksums in Bundler

  • Add a lockfile_checksums configuration to include checksums in fresh lockfiles.
  • Add bundle lock --add-checksums to add checksums to an existing lockfile.

It is now possible to add checksums to Gems using RubyGems. By adding checksums to Gems when the lockfile is generated, and verifying them when bundle install is run using that lockfile, the consistency of libraries in each execution environment can be guaranteed, and tampering by the library provider or man-in-the-middle attacks can be detected.
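The idea behind the verification can be illustrated with a few lines of Ruby. This is not Bundler's implementation: checksum_matches? is a hypothetical helper, the file is a stand-in for a downloaded .gem, and SHA-256 is assumed as the digest.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: true when the file's SHA-256 digest matches the
# checksum that was recorded earlier.
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

Tempfile.create("example.gem") do |file|
  file.binmode
  file.write("pretend gem contents")
  file.flush
  recorded = Digest::SHA256.file(file.path).hexdigest # recorded at lock time
  p checksum_matches?(file.path, recorded)            #=> true

  file.write("tampered") # the content changes after the checksum was recorded
  file.flush
  p checksum_matches?(file.path, recorded)            #=> false
end
```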

(atpons)

JSON performance improvements

  • Performance improvements JSON.parse about 1.5 times faster than json-2.7.x.

It seems that JSON processing has become much faster.
For more details, see here.

(ko1)

Now possible to create a Tempfile with no actual file

  • The keyword argument anonymous: true is implemented for Tempfile.create. Tempfile.create(anonymous: true) removes the created temporary file immediately. So applications don't need to remove the file. [Feature #20497]

Tempfile.create(anonymous: true) now creates a temporary file without creating a real file.
To be precise, it seems that on Linux, where O_TMPFILE is available, a temporary file is not created, but on other platforms, including Windows, a temporary file is created for a moment and then immediately deleted after the IO handle is obtained.

I'm not sure what the user benefits are, but perhaps it avoids problems such as the temporary file not being deleted if the Ruby interpreter terminates abnormally due to a segfault, or the filename generated by the random number generator colliding with an existing file (a problem that seems unlikely enough not to be a concern).
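A minimal usage sketch follows. with_scratch_file is a hypothetical helper that falls back to an ordinary temporary file on versions where Tempfile.create does not accept anonymous:; the parameter check is one assumed way to detect support.

```ruby
require "tempfile"

# Hypothetical helper: uses anonymous: true where Tempfile.create supports it
# (Ruby 3.4+) and otherwise falls back to an ordinary temporary file.
def with_scratch_file(&block)
  if Tempfile.method(:create).parameters.include?([:key, :anonymous])
    Tempfile.create(anonymous: true, &block)
  else
    Tempfile.create("scratch", &block)
  end
end

content = with_scratch_file do |file|
  file.write("hello")
  file.rewind
  file.read
end
p content #=> "hello"
```

In both branches the block form closes the file when the block finishes, so the caller has nothing to clean up.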

(mame)

win32/sspi.rb has been removed from the Ruby repository

  • This library is now extracted from the Ruby repository to [ruby/net-http-sspi]. [Feature #20775]

I don't know what this is, but it seems to have been removed. I think it wasn't working in the first place?

(ko1)

The default gems have been updated in various ways

win32-registry 0.1.0 has been added, and many other libraries have been updated.

The bundled gems have been updated in various ways

They have been updated in various ways.

repl_type_completor has been added

The following bundled gem has been added.

  • repl_type_completor 0.1.9

katakata_irb, which was announced at RubyKaigi 2023, has been renamed and is now shipped with Ruby as a bundled gem. This is a library that supports completion in IRB, and it uses Prism and RBS to provide smarter completion candidates than before. By default, IRB analyzes completion targets using regular expressions, so there were problems such as it sometimes suggesting methods that could not be called, but repl_type_completor makes it possible to suggest more accurate candidates. Even if you haven't written any RBS, completion automatically becomes smarter because RBS for built-in classes is included as standard and the analysis itself has become more powerful.
In IRB, repl_type_completor is used preferentially if it is present in the execution environment. In a Bundler environment, bundle add repl_type_completor is required.

(ima1zumi)

Transfer from default gems to bundled gems

The following bundled gems are promoted from default gems.

These libraries have been changed from default gems to bundled gems.
If you are developing an app using Gemfile, please add these libraries to your Gemfile.

(ko1)

Compatibility issues

Error messages and backtrace have been changed

  • Error messages and backtrace displays have been changed.
    • Use a single quote instead of a backtick as an opening quote. [Feature #16495]
    • Display a class name before a method name (only when the class has a permanent name). [Feature #19117]
    • Extra rescue/ensure frames are no longer available on the backtrace. [Feature #20275]
    • Kernel#caller, Thread::Backtrace::Location’s methods, etc. are also changed accordingly.

It's not very flashy, but there have been a lot of changes. Spot the differences.

Old example:

test.rb:1:in `foo': undefined method `time' for an instance of Integer
from test.rb:2:in `<main>'

New example:

test.rb:1:in 'Object#foo': undefined method 'time' for an instance of Integer
from test.rb:2:in '<main>'

For one thing, the method name display has changed from foo to Object#foo. It's now easier to see which method it is (on the other hand, it's also possible that it's become more verbose, so if you find yourself complaining a lot, it might be a good idea to speak up now).
The other thing is that backticks have been changed to single quotes. This avoids the problem of error messages being rendered strangely when they are copied and pasted into Markdown.

I was the one who made this change, but even this small change had an unexpected impact on compatibility, and it was a lot of work. There were some naughty libraries that parsed error messages and backtraces. For example, irb and test-unit. I don't think it will affect the majority of libraries, but if there is any problem, please feel free to report it so we can investigate.
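For code that really does have to parse backtrace lines, a pattern that accepts both formats might look like this. It is only a sketch: method_name_from is a hypothetical helper, and the regular expression covers the two example lines above rather than every possible label.

```ruby
# Hypothetical helper: extracts the bare method name from a backtrace line in
# either the old (`foo') or the new ('Object#foo') format.
def method_name_from(backtrace_line)
  match = /:\d+:in [`'](?<label>.*?)'/.match(backtrace_line)
  return nil unless match

  # Drop the optional "Class#" or "Class." prefix that Ruby 3.4 adds.
  match[:label].split(/[#.]/).last
end

old_line = "test.rb:1:in `foo': undefined method `time' for an instance of Integer"
new_line = "test.rb:1:in 'Object#foo': undefined method 'time' for an instance of Integer"
p method_name_from(old_line) #=> "foo"
p method_name_from(new_line) #=> "foo"
```

Where a Thread::Backtrace::Location object is available (for example from caller_locations), reading its attributes is likely to be more robust than parsing strings.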

(mame)

Hash#inspect results have changed

  • Hash#inspect rendering have been changed. [Bug #20433]

    • Symbol keys are displayed using the modern symbol key syntax: "{user: 1}"
    • Other keys now have spaces around =>: '{"user" => 1}', while previously they didn't: '{"user"=>1}'

The appearance of a p (inspect) of a Hash has changed.

h = { user: 1 }
p h #=> {:user=>1} until Ruby 3.3
    #=> {user: 1} from Ruby 3.4

I think that very few people write hashes like { :user => 1 } these days, so this change is to match that.

This is probably the biggest incompatibility in Ruby 3.4, apart from the warning for the default frozen_string_literal. If you have the following code in your tests, it will fail.

str = "result: #{ h }"

expect(str).to eq("result: {:user=>1}")

If you only need to support Ruby 3.4 or later, you can just change the expected value, but if you need to support both Ruby 3.3 and 3.4, for example when testing a library, you might want to change it like this:

expect(str).to eq("result: #{ { user: 1 } }")

Well, since the inspect string may change in this way, it is not recommended to use it as the expected value in a test.
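One way to avoid depending on Hash#inspect is to build the string explicitly. The following is a sketch under that assumption: stable_format is a hypothetical helper that handles flat Hashes only (nested Hash values would still go through inspect).

```ruby
# Hypothetical helper: formats a flat Hash the same way on every Ruby version
# instead of relying on Hash#inspect.
def stable_format(hash)
  pairs = hash.map { |key, value| "#{key.inspect} => #{value.inspect}" }
  "{#{pairs.join(', ')}}"
end

p stable_format({ user: 1, "str" => 2 }) #=> "{:user => 1, \"str\" => 2}"
```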

Incidentally, for a Hash that contains a mixture of Symbol and non-Symbol keys, only the Symbol keys are displayed with a colon.

h = { :sym => 1, "str" => 2 }

p h #=> {sym: 1, "str" => 2}

As an aside, the circumstances leading up to this change are a little interesting. It all started with a bug report saying that a Hash with the key :< and the value 1 was output as {:<=>1} by inspect, and that evaluating that string with eval did not give back the original value. There is no guarantee that the result of inspect can be turned back into the original value with eval, but {:<=>1} is hard to read, so we wanted to fix it. So it was proposed to insert spaces, like {:< => 1}. But then, should {:key=>1} also be changed to {:key => 1}? I think that {:< => 1} will not affect most people, but changing to {:key => 1} is a certain degree of incompatibility. There was also the option of giving up on the change, but there had long been talk of “making Hash#inspect into the {key: 1} format someday” that kept smoldering and then fading away, and the momentum grew that “if it's going to be incompatible anyway, let's do it all at once here”, and that's how this change came about.

(mame)

Converting a string to a floating-point number now accepts a string with the fractional part omitted

  • Kernel#Float() now accepts a decimal string with decimal part omitted. [Feature #20705]

  • String#to_f now accepts a decimal string with decimal part omitted. [Feature #20705]
    Note that the result changes when an exponent is specified.

In short, strings that end with a bare decimal point, like "1.", are now interpreted.

Float("1.") #=> 1.0

Previously, Kernel#Float raised an exception for such strings. String#to_f has always parsed as much of the string as it can interpret (in the example above, ignoring the trailing period), so its result for "1." is unchanged.

The fractional part between the period and the exponent notation can also be omitted.

Float("1.E-1") #=> 0.1

This also used to raise an exception in Kernel#Float. String#to_f used to ignore everything from the period onward and return 1.0, but in Ruby 3.4 and later it returns 0.1, so this is a slight incompatibility. I hope no one depends on the old behavior.

Incidentally, there are quite a few languages that interpret 1. as a decimal number, including C and Python. Of course, in the Ruby language, the period is a method call operator, so 1. cannot be interpreted as a decimal number.
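If input in this form has to be parsed identically on Ruby 3.3 and 3.4, the omitted digit can be filled in before conversion. This is a sketch: normalize_float_string is a hypothetical helper and only handles a period that is directly followed by the end of the string or by an exponent.

```ruby
# Hypothetical helper: inserts the omitted fractional digit so that older and
# newer Ruby versions parse the string identically.
def normalize_float_string(str)
  str.sub(/\.(?=\z|[eE])/, ".0")
end

p normalize_float_string("1.")         #=> "1.0"
p Float(normalize_float_string("1."))  #=> 1.0
p normalize_float_string("1.E-1").to_f #=> 0.1
p normalize_float_string("1.5")        #=> "1.5"
```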

(mame)

Refinement#refined_class has been removed

  • Removed deprecated method Refinement#refined_class. [Feature #19714]

It has been removed. Please use Refinement#target from now on.

(mame)

Stdlib incompatibilities

There also appear to be some changes in the standard library.

  • DidYouMean

    • DidYouMean::SPELL_CHECKERS[]= and DidYouMean::SPELL_CHECKERS.merge! are removed.
  • Net::HTTP

    • Removed the following deprecated constants:
      • Net::HTTP::ProxyMod
      • Net::NetPrivate::HTTPRequest
      • Net::HTTPInformationCode
      • Net::HTTPSuccessCode
      • Net::HTTPRedirectionCode
      • Net::HTTPRetriableCode
      • Net::HTTPClientErrorCode
      • Net::HTTPFatalErrorCode
      • Net::HTTPServerErrorCode
      • Net::HTTPResponseReceiver
      • Net::HTTPResponceReceiver

    These constants were deprecated from 2012.

  • Timeout

    • Reject negative values for Timeout.timeout. [Bug #20795]
  • URI

    • Switched default parser to RFC 3986 compliant from RFC 2396 compliant. [Bug #19266]

C API updates

  • rb_newobj and rb_newobj_of (and corresponding macros RB_NEWOBJ, RB_NEWOBJ_OF, NEWOBJ, NEWOBJ_OF) have been removed. [Feature #20265]

These C APIs have been removed.

rb_gc_force_recycle has also been removed. It was previously a no-op, so if you're using it, just remove the call.

To begin with, what this did was to manually free objects that were no longer needed before the GC ran. However, it was very difficult to use, and if it was used on a live object by mistake, it would cause strange things to happen, and it would require special processing for the GC, so recently it was more of a nuisance than anything else. So, it's a good thing it's been removed.

(ko1)

Implementation improvements

Prism is now the default parser

  • The default parser is now Prism. To use the conventional parser, use the command-line argument --parser=parse.y. [Feature #20564]

Prism is now the default parser for Ruby.

For the past few years, the Ruby parser has been a very hot topic at RubyKaigi and other events, but even so, it's just an internal improvement, so I don't think there's anything for ordinary users to be aware of. If anything, the error messages for syntax errors have changed.

$ ruby -e 'foo('
-e: -e:1: syntax error found (SyntaxError)
> 1 | foo(
    |     ^ unexpected end-of-input; expected a `)` to close the arguments

Is it convenient to have snippets? It used to look like this:

$ ruby --parser=parse.y -e 'foo('
-e:1: syntax error, unexpected end-of-input, expecting ')'
ruby: compile error (SyntaxError)

Also, there are still some incompatibilities in Prism in corner cases (and I'm sure there will be more found in the future), so there may be some changes here and there.

(mame)

Happy Eyeballs v2 has been implemented

  • Happy Eyeballs version 2 (RFC8305), an algorithm that ensures faster and more reliable connections by attempting IPv6 and IPv4 concurrently, is used in Socket.tcp and TCPSocket.new. To disable it globally, set the environment variable RUBY_TCP_NO_FAST_FALLBACK=1 or call Socket.tcp_fast_fallback=false. Or to disable it on a per-method basis, use the keyword argument fast_fallback: false. [Feature #20108] [Feature #20782]

TCPSocket.new and Socket.tcp now connect by default using Happy Eyeballs v2. Happy Eyeballs v2 is a mechanism that tries to connect to the specified host over both IPv6 and IPv4 concurrently, and uses whichever connection is established first.

Previously, it tried to connect to IPv6 and IPv4 sequentially. With this method, if you tried to connect to IPv6 first in an environment where there were problems with IPv6 communication for some reason, it would take a long time because it wouldn't switch to trying to connect to IPv4 until the IPv6 connection timed out. Ruby 3.4 connects to IPv6 and IPv4 simultaneously, so even if IPv6 doesn't work, IPv4 will respond immediately and the connection should be established without waiting for a timeout.

Well, actually, I personally feel that it is more of a transitional measure needed for society to gradually shift from IPv4 to IPv6, than a direct user benefit. Happy Eyeballs is a social responsibility. Ruby is doing a great job of fulfilling its social responsibilities.

Happy Eyeballs has a slight overhead because it connects in parallel. If you really want to stop Happy Eyeballs, set Socket.tcp_fast_fallback = false or set the environment variable RUBY_TCP_NO_FAST_FALLBACK=1. In addition to the overhead, it may also be useful for isolating problems when connection problems occur.
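The two ways of turning it off from Ruby code can be sketched as follows. Both helpers are hypothetical, and the respond_to? checks are only there so that the code also loads on versions before 3.4.

```ruby
require "socket"

# Hypothetical helper: opens one connection with Happy Eyeballs v2 disabled
# for this call only (the fast_fallback keyword exists on Ruby 3.4+).
def connect_sequentially(host, port)
  if Socket.respond_to?(:tcp_fast_fallback)
    TCPSocket.new(host, port, fast_fallback: false)
  else
    TCPSocket.new(host, port)
  end
end

# Hypothetical helper: disables Happy Eyeballs v2 globally while the block
# runs. The setting is process-wide, so other threads are affected as well.
def without_fast_fallback
  return yield unless Socket.respond_to?(:tcp_fast_fallback=)

  saved = Socket.tcp_fast_fallback
  Socket.tcp_fast_fallback = false
  begin
    yield
  ensure
    Socket.tcp_fast_fallback = saved
  end
end

# Usage (placeholder host, not executed here):
#   without_fast_fallback { TCPSocket.new("example.com", 80) }
#   connect_sequentially("example.com", 80)
```

The per-call keyword is the narrower choice when only one connection needs the old sequential behavior; the global switch is more convenient when isolating a connection problem.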

(mame)

A mechanism for switching GC implementations has been introduced

  • Alternative garbage collector (GC) implementations can be loaded dynamically through the modular garbage collector feature. To enable this feature, configure Ruby with --with-modular-gc at build time. GC libraries can be loaded at runtime using the environment variable RUBY_GC_LIBRARY. [Feature #20351]

By specifying --with-modular-gc in configure when building Ruby, it is now possible to build Ruby so that the GC implementation can be switched at Ruby process startup. The implementation to load is selected with the RUBY_GC_LIBRARY environment variable.

Since this option is not specified by default when building, it can be said that this was introduced for the purpose of experimenting with replacing the GC implementation.

  • Ruby's built-in garbage collector has been split into a separate file at gc/default/default.c and interacts with Ruby using an API defined in gc/gc_impl.h. The built-in garbage collector can now also be built as a library using make modular-gc MODULAR_GC=default and enabled using the environment variable RUBY_GC_LIBRARY=default. [Feature #20470]

The directory structure related to GC has been changed so that the GC implementation can be changed in various ways.

  • An experimental GC library is provided based on MMTk. This GC library can be built using make modular-gc MODULAR_GC=mmtk and enabled using the environment variable RUBY_GC_LIBRARY=mmtk. This requires the Rust toolchain on the build machine. [Feature #20860]

Using the feature to change the GC implementation, an experimental GC implementation called MMTk, which is a collection of many GC implementations, was provided. MMTk contains GC implementations that are the culmination of research, but due to limitations in CRuby, only a very simple one is currently supported, and it is said that it will be developed in the future. It's written in Rust, isn't it?

(ko1)

YJIT

New features

  • Add unified memory limit via --yjit-mem-size command-line option (default 128MiB) which tracks total YJIT memory usage and is more intuitive than the old --yjit-exec-mem-size.
  • More statistics now always available via RubyVM::YJIT.runtime_stats
  • Add compilation log to track what gets compiled via --yjit-log
    • Tail of the log also available at run-time via RubyVM::YJIT.log
  • Add support for shareable consts in multi-ractor mode
  • Can now trace counted exits with --yjit-trace-exits=COUNTER

New optimizations

  • Compressed context reduces memory needed to store YJIT metadata
  • Improved allocator with ability to allocate registers for local variables
  • When YJIT is enabled, use more Core primitives written in Ruby:
    • Array#each, Array#select, Array#map rewritten in Ruby for better performance [Feature #20182].
  • Ability to inline small/trivial methods such as:
    • Empty methods
    • Methods returning a constant
    • Methods returning self
    • Methods directly returning an argument
  • Specialized codegen for many more runtime methods
  • Optimize String#getbyte, String#setbyte and other string methods
  • Optimize bitwise operations to speed up low-level bit/byte manipulation
  • Various other incremental optimizations

It seems that YJIT has also improved in various ways. I'm sure there will be a detailed explanation, so I'll skip it here.

(ko1)

Miscellaneous changes

Added a warning when passing a block to a method that doesn't use the passed block

  • Passing a block to a method which doesn't use the passed block will show a warning on verbose mode (-w). [Feature #15554]

When passing a block to a method that doesn't use the passed block, a warning will now be issued if -w is specified.

def f = nil
f{}
#=> warning: the block passed to 'Object#f' defined at t.rb:1 may be ignored

However, with this rule alone, a warning would also be issued when a block is passed deliberately even though some receivers don't need it, as in the following duck-typing example, and we found that this is quite common.

class C
  def f = yield # This uses a block
end

class D
  def f = nil # This one doesn't use a block
end

[C.new, D.new].each{ it.f{} }
#=> No warning is given because there is a method named “f” that uses a block

So, in this way, if there is a method with the same name that uses a block, no warning is given (even with -w).

If you don't like this loosening of the conditions, try running with -W:strict_unused_block.

$ ./miniruby -W:strict_unused_block -e '
class C
  def f = yield # This uses a block
end

class D
  def f = nil # This does not use a block
end

[C.new, D.new].each{ it.f{} }
'
-e:10: warning: the block passed to 'D#f' defined at -e:7 may be ignored

You will now get a warning.

In fact, when I tried it out with our app, I found one example in the rspec tests where a block was being passed where it shouldn't have been. I only found it because I was using the strict version, so you might want to be prepared for a lot of false positives.

(ko1)

Warnings are now given for redefining methods that we wouldn't expect to be redefined

  • Redefining some core methods that are specially optimized by the interpreter and JIT like String#freeze or Integer#+ now emits a performance class warning (-W:performance or Warning[:performance] = true). [Feature #20429]

If methods that are not supposed to be redefined, such as String#freeze and Integer#+, are redefined, it has a negative impact on the interpreter's performance (because the interpreter is optimized on the assumption that they will not be redefined). Therefore, if such a method is redefined, a warning will now be issued if -W:performance is specified.

(ko1)

Conclusion

We have introduced the new features and improvements of Ruby 3.4. In addition to the ones introduced here, there are also bug fixes and minor improvements. We hope you will check them out in your Ruby applications.

Although Ruby 3.4 doesn't have any particularly eye-catching new features, there have been a lot of implementation improvements, such as the switch of the parser to Prism, as well as the introduction of the new it block parameter. There have also been some decisions that will have a major impact on the future development of Ruby, such as the move toward making frozen-string-literal the default. Please install it on your computer and enjoy the new Ruby.

Enjoy Ruby programming!

(ko1/mame, guest: ima1zumi, atpons)
