登録: 21ヵ月前

完了: 9ヵ月前

#10858 closed task (fixed)

Smaller generated Ord instances

報告者: nomeata 担当者:
優先度: normal マイルストーン: 8.2.1
コンポーネント: Compiler バージョン: 7.10.2
キーワード: deriving-perf 関係者: erikd, RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #9557 Differential Rev(s): Phab:D2502
Wiki Page:

詳細

With very large modules with lots of deriving code, GHC spends quite some time compiling. I believe that we can improve things by reducing the size of the generated code.

(More in the comments)

更新履歴 (21)

comment:1 更新者: nomeata (21ヵ月前)

キーワード: newcomer を追加

Consider this code generated for Ord for a product type:

  instance Ord l_acv3 => Ord (ImportDecl l_acv3) where
    compare a_acXz b_acXA
      = case a_acXz of {
          ImportDecl a1_acXB
                     a2_acXC
                     a3_acXD
                     a4_acXE
                     a5_acXF
                     a6_acXG
                     a7_acXH
                     a8_acXI
            -> case b_acXA of {
                 ImportDecl b1_acXJ
                            b2_acXK
                            b3_acXL
                            b4_acXM
                            b5_acXN
                            b6_acXO
                            b7_acXP
                            b8_acXQ
                   -> case (compare a1_acXB b1_acXJ) of {
                        LT -> LT
                        EQ
                          -> case (compare a2_acXC b2_acXK) of {
                               LT -> LT
                               EQ
                                 -> case (compare a3_acXD b3_acXL) of {
                                      LT -> LT
                                      EQ
                                        -> case (compare a4_acXE b4_acXM) of {
                                             LT -> LT
                                             EQ
                                               -> case (compare a5_acXF b5_acXN) of {
                                                    LT -> LT
                                                    EQ
                                                      -> case (compare a6_acXG b6_acXO) of {
                                                           LT -> LT
                                                           EQ
                                                             -> case (compare a7_acXH b7_acXP) of {
                                                                  LT -> LT
                                                                  EQ -> (a8_acXI `compare` b8_acXQ)
                                                                  GT -> GT }
                                                           GT -> GT }
                                                    GT -> GT }
                                             GT -> GT }
                                      GT -> GT }
                               GT -> GT }
                        GT -> GT } } }
    (<) a_acXR b_acXS
      = case a_acXR of {
          ImportDecl a1_acXT
                     a2_acXU
                     a3_acXV
                     a4_acXW
                     a5_acXX
                     a6_acXY
                     a7_acXZ
                     a8_acY0
            -> case b_acXS of {
                 ImportDecl b1_acY1
                            b2_acY2
                            b3_acY3
                            b4_acY4
                            b5_acY5
                            b6_acY6
                            b7_acY7
                            b8_acY8
                   -> case (compare a1_acXT b1_acY1) of {
                        LT -> True
                        EQ
                          -> case (compare a2_acXU b2_acY2) of {
                               LT -> True
                               EQ
                                 -> case (compare a3_acXV b3_acY3) of {
                                      LT -> True
                                      EQ
                                        -> case (compare a4_acXW b4_acY4) of {
                                             LT -> True
                                             EQ
                                               -> case (compare a5_acXX b5_acY5) of {
                                                    LT -> True
                                                    EQ
                                                      -> case (compare a6_acXY b6_acY6) of {
                                                           LT -> True
                                                           EQ
                                                             -> case (compare a7_acXZ b7_acY7) of {
                                                                  LT -> True
                                                                  EQ -> (a8_acY0 < b8_acY8)
                                                                  GT -> False }
                                                           GT -> False }
                                                    GT -> False }
                                             GT -> False }
                                      GT -> False }
                               GT -> False }
                        GT -> False } } }
    (<=) a_acY9 b_acYa
      = case a_acY9 of {
          ImportDecl a1_acYb
                     a2_acYc
                     a3_acYd
                     a4_acYe
                     a5_acYf
                     a6_acYg
                     a7_acYh
                     a8_acYi
            -> case b_acYa of {
                 ImportDecl b1_acYj
                            b2_acYk
                            b3_acYl
                            b4_acYm
                            b5_acYn
                            b6_acYo
                            b7_acYp
                            b8_acYq
                   -> case (compare a1_acYb b1_acYj) of {
                        LT -> True
                        EQ
                          -> case (compare a2_acYc b2_acYk) of {
                               LT -> True
                               EQ
                                 -> case (compare a3_acYd b3_acYl) of {
                                      LT -> True
                                      EQ
                                        -> case (compare a4_acYe b4_acYm) of {
                                             LT -> True
                                             EQ
                                               -> case (compare a5_acYf b5_acYn) of {
                                                    LT -> True
                                                    EQ
                                                      -> case (compare a6_acYg b6_acYo) of {
                                                           LT -> True
                                                           EQ
                                                             -> case (compare a7_acYh b7_acYp) of {
                                                                  LT -> True
                                                                  EQ -> (a8_acYi <= b8_acYq)
                                                                  GT -> False }
                                                           GT -> False }
                                                    GT -> False }
                                             GT -> False }
                                      GT -> False }
                               GT -> False }
                        GT -> False } } }
    (>=) a_acYr b_acYs
      = case a_acYr of {
          ImportDecl a1_acYt
                     a2_acYu
                     a3_acYv
                     a4_acYw
                     a5_acYx
                     a6_acYy
                     a7_acYz
                     a8_acYA
            -> case b_acYs of {
                 ImportDecl b1_acYB
                            b2_acYC
                            b3_acYD
                            b4_acYE
                            b5_acYF
                            b6_acYG
                            b7_acYH
                            b8_acYI
                   -> case (compare a1_acYt b1_acYB) of {
                        LT -> False
                        EQ
                          -> case (compare a2_acYu b2_acYC) of {
                               LT -> False
                               EQ
                                 -> case (compare a3_acYv b3_acYD) of {
                                      LT -> False
                                      EQ
                                        -> case (compare a4_acYw b4_acYE) of {
                                             LT -> False
                                             EQ
                                               -> case (compare a5_acYx b5_acYF) of {
                                                    LT -> False
                                                    EQ
                                                      -> case (compare a6_acYy b6_acYG) of {
                                                           LT -> False
                                                           EQ
                                                             -> case (compare a7_acYz b7_acYH) of {
                                                                  LT -> False
                                                                  EQ -> (a8_acYA >= b8_acYI)
                                                                  GT -> True }
                                                           GT -> True }
                                                    GT -> True }
                                             GT -> True }
                                      GT -> True }
                               GT -> True }
                        GT -> True } } }
    (>) a_acYJ b_acYK
      = case a_acYJ of {
          ImportDecl a1_acYL
                     a2_acYM
                     a3_acYN
                     a4_acYO
                     a5_acYP
                     a6_acYQ
                     a7_acYR
                     a8_acYS
            -> case b_acYK of {
                 ImportDecl b1_acYT
                            b2_acYU
                            b3_acYV
                            b4_acYW
                            b5_acYX
                            b6_acYY
                            b7_acYZ
                            b8_acZ0
                   -> case (compare a1_acYL b1_acYT) of {
                        LT -> False
                        EQ
                          -> case (compare a2_acYM b2_acYU) of {
                               LT -> False
                               EQ
                                 -> case (compare a3_acYN b3_acYV) of {
                                      LT -> False
                                      EQ
                                        -> case (compare a4_acYO b4_acYW) of {
                                             LT -> False
                                             EQ
                                               -> case (compare a5_acYP b5_acYX) of {
                                                    LT -> False
                                                    EQ
                                                      -> case (compare a6_acYQ b6_acYY) of {
                                                           LT -> False
                                                           EQ
                                                             -> case (compare a7_acYR b7_acYZ) of {
                                                                  LT -> False
                                                                  EQ -> (a8_acYS > b8_acZ0)
                                                                  GT -> True }
                                                           GT -> True }
                                                    GT -> True }
                                             GT -> True }
                                      GT -> True }
                               GT -> True }
                        GT -> True } } }

This is huge! And so redundant.

If the implementation of the operators > etc. build on the individual compare functions anyways, then it would surely be worth it to use the compare chain that was generated for the compare function – i.e. simply use the default instance.

Also, for the compare function, maybe we should add a function

thenCmp :: Ordering -> Ordering -> Ordering
thenCmp EQ o2 = o2
thenCmp o1 _  = o1

to Data.Ord and use that to simply this to

  instance Ord l_acv3 => Ord (ImportDecl l_acv3) where
    compare a_acXz b_acXA
      = case a_acXz of {
          ImportDecl a1_acXB
                     a2_acXC
                     a3_acXD
                     a4_acXE
                     a5_acXF
                     a6_acXG
                     a7_acXH
                     a8_acXI
            -> case b_acXA of {
                 ImportDecl b1_acXJ
                            b2_acXK
                            b3_acXL
                            b4_acXM
                            b5_acXN
                            b6_acXO
                            b7_acXP
                            b8_acXQ
                   -> compare a1_acXB b1_acXJ `thenComp`
                      compare a2_acXC b2_acXK `thenComp`
                      compare a3_acXD b3_acXL `thenComp`
                      compare a4_acXE b4_acXM `thenComp`
                      compare a5_acXF b5_acXN `thenComp`
                      compare a6_acXG b6_acXO `thenComp`
                      compare a7_acXH b7_acXP `thenComp`
                      compare a8_acXI b8_acXQ

Maybe this is a nice newcomer ticket: Relatively local, rewarding, and easy to get right.

This is mkCompareFields in TcGenDeriv

Also look at the other deriving instances for refactoring possibilities!

comment:2 更新者: rwbarton (21ヵ月前)

キーワード: newcomer を削除

How does this relate to #9557?

comment:3 更新者: rwbarton (21ヵ月前)

キーワード: newcomer を追加

comment:4 更新者: nomeata (21ヵ月前)

キーワード: newcomer を削除
概要: Smaller generated instancesSmaller generated Ord instances

Ah, so I had that feeling before. I guess it is a duplicate.. or maybe this ticket is now more specific to talk about Ord.

Re note Note [Do not rely on compare]: Note that we actually do rely on compare for all but the last field. So either the note can be ignored (at least in the case of product types), or a similar chain, not using compare can be built for the other operators:

    (<) a_acXz b_acXA
      = case a_acXz of {
          ImportDecl a1_acXB
                     a2_acXC
                     a3_acXD
                     a4_acXE
                     a5_acXF
                     a6_acXG
                     a7_acXH
                     a8_acXI
            -> case b_acXA of {
                 ImportDecl b1_acXJ
                            b2_acXK
                            b3_acXL
                            b4_acXM
                            b5_acXN
                            b6_acXO
                            b7_acXP
                            b8_acXQ
                   -> (<) a1_acXB b1_acXJ ||
                      (<) a2_acXC b2_acXK ||
                      (<) a3_acXD b3_acXL ||
                      (<) a4_acXE b4_acXM ||
                      (<) a5_acXF b5_acXN ||
                      (<) a6_acXG b6_acXO ||
                      (<) a7_acXH b7_acXP ||
                      (<) a8_acXI b8_acXQ

This follows the same pattern above, just using the right connective: It should be || for (<) and (>), && for (<=) and (>=) and thenComp for compare. And one should check that this is right-associative, to get the right lazy behavior.

Still a nice ticket for new contributors.

comment:5 更新者: nomeata (21ヵ月前)

The annoying thing with newcomer tickets is that they are also usual fun and rewarding things to work on. So I gave it a shot at branch wip/T10858.

comment:6 更新者: nomeata (21ヵ月前)

With some of the ideas in #10858 implemented, the number of terms reported in -ddump-ds for Language.Haskell.Exts.Annotated.Syntax goes down from 207,936 to 202,252. An improvement, but I was hoping for more.

Using the overloaded (<>) on Ordering incurs a cost; lots of code just to call <> with the right instance. It might be worth monomorphizing that to thenCmp.

comment:7 更新者: nomeata (21ヵ月前)

Please ignore my last four comments. This idea is completely bogus for the other operators (it still works for compare). I blame it on the jet lag and better go to bed now.

comment:8 更新者: thomie (21ヵ月前)

Type of failure: None/UnknownCompile-time performance bug

comment:9 更新者: simonpj (21ヵ月前)

I'm not following the details here but:

  • I think it's an excellent idea to use thenCmp (will dramatically simplify the output of -ddump-deriv)
  • But I suspect it won't make a lot of difference in the end, because GHC will probably inline thenCmp!

But what is stranger to me is this: why don't we just derive the code for compare, and use the default methods for (>), (==) etc?

Simon

comment:10 更新者: nomeata (21ヵ月前)

But what is stranger to me is this: why don't we just derive the code for compare, and use the default methods for (>), (==) etc?

That’s what I thought at first myself, but then I stumbled over this:

Note [Do not rely on compare]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It's a bad idea to define only 'compare', and build the other binary
comparisons on top of it; see Trac #2130, #4019.  Reason: we don't
want to laboriously make a three-way comparison, only to extract a
binary result, something like this:
     (>) (I# x) (I# y) = case <# x y of
                            True -> False
                            False -> case ==# x y of
                                       True  -> False
                                       False -> True

So for sufficiently small types (few constructors, or all nullary)
we generate all methods; for large ones we just use 'compare'.

comment:11 更新者: simonpj (21ヵ月前)

That Note looks plausible but ONLY if you recursively use (<) in the implementation of (<). But in the code you give above for Ord on ImportDecl we seem to call compare recursively when implementing (<). So we are taking the hit described in the Note, but blowing up the code much more than necessary!

comment:12 更新者: nomeata (21ヵ月前)

Note that we are calling compare in the implementation of < for all but the last argument. (You might have to scroll to the right to see it). In the not uncommon corner case of only one field, we are thus only calling <, as described in the Note.

So it seems that for large product types, the default implementation is not much worse than the generated, while for data types with few or only one field, we should generate dedicated methods. But where is the bound? What with datatypes with multiple constructors?

comment:13 更新者: ezyang (18ヵ月前)

キーワード: deriving-perf を追加

comment:14 更新者: nkartashov (16ヵ月前)

nomeata: so you have managed to get a little reduction of generated code size for compare but failed for the rest of operations?

comment:15 更新者: nomeata (16ヵ月前)

It’s been a while, so I’m not sure. From a quick reading, I think the rationale is: For large product types, the chain of compare calls can be implemented with less code using thenCmp. Also for large product types, the default implementations of (<) etc. should be good enough. One question is what “large” means here.

comment:16 更新者: erikd (14ヵ月前)

関係者: erikd を追加

comment:17 更新者: prokhorenkov (9ヵ月前)

Differential Rev(s): D2502

May I interest you all with a really newcomer-grade change?

We can just generate "compare" and "<" as usual and express other relations through "<". Like

a > b = b < a
a <= b = not $ b < a
a >= b = not $ a < b

This saves us code for 3 methods out of 5 for a small added cost of extra negation. When the code for "<" is short enough the inliner would make the code as good as can be. Otherwise added negation would be insignificant.

Adding new differential rev with the proposed changes.

comment:18 更新者: simonpj (9ヵ月前)

Differential Rev(s): D2502Phab:D2502

comment:19 更新者: RyanGlScott (9ヵ月前)

関係者: RyanGlScott を追加

comment:20 更新者: Ben Gamari <ben@…> (9ヵ月前)

In 4ff4929c/ghc:

Make generated Ord instances smaller (per #10858).

Reviewers: simonpj, bgamari, RyanGlScott, austin

Reviewed By: simonpj

Subscribers: nomeata, simonpj, thomie

Differential Revision: https://phabricator.haskell.org/D2502

GHC Trac Issues: #10858

comment:21 更新者: bgamari (9ヵ月前)

ステータス: newclosed
マイルストーン: 8.2.1
解決方法: fixed
詳しい使い方は TracTickets を参照してください。