root/cebix/BasiliskII/src/uae_cpu/compiler/codegen_x86.cpp
Revision 1.41 - (view) (annotate) - [select for diffs]
2008-02-16T22:15:00Z (16 years, 4 months ago) by gbeauche
Branch: MAIN
CVS Tags: HEAD
Changes since 1.40: +4 -4 lines
Diff to previous 1.40
Cope with assembler updates.

Revision 1.40 - (view) (annotate) - [select for diffs]
2008-01-01T09:40:35Z (16 years, 6 months ago) by gbeauche
Branch: MAIN
Changes since 1.39: +1 -1 lines
Diff to previous 1.39
Happy New Year!

Revision 1.39 - (view) (annotate) - [select for diffs]
2007-06-29T16:36:03Z (17 years ago) by gbeauche
Branch: MAIN
Changes since 1.38: +53 -0 lines
Diff to previous 1.38
Implement CMOV.B and CMOV.W translations. Only the latter has a native
x86 equivalent however.

Revision 1.38 - (view) (annotate) - [select for diffs]
2007-01-14T13:23:36Z (17 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.37: +15 -15 lines
Diff to previous 1.37
Fix CMOV emulation on x86_64 in case the CPU doesn't support that instruction
(which is very unlikely).

Revision 1.37 - (view) (annotate) - [select for diffs]
2007-01-14T13:07:22Z (17 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.36: +1 -3 lines
Diff to previous 1.36
The older code generator is now deprecated on x86-32 too.

Revision 1.36 - (view) (annotate) - [select for diffs]
2007-01-14T12:23:29Z (17 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.35: +127 -31 lines
Diff to previous 1.35
Use SAHF_SETO_PROFITABLE wherever possible on x86-64; it's faster. This can't
be the default because some very ancient CPUs don't support LAHF in long mode.

Revision 1.35 - (view) (annotate) - [select for diffs]
2007-01-13T18:21:30Z (17 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.34: +37 -31 lines
Diff to previous 1.34
Remove the 33-bit addressing hack as it's overly complex for not much gain.
Rather, use an address override prefix (0x67), though the Intel Core optimization
reference guide says to avoid LCP prefixes. In practice, the measured impact on
performance is marginal on e.g. Speedometer tests.
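
As a rough illustration of the address override prefix mentioned above, the
sketch below encodes a 32-bit load for 64-bit mode with and without the 0x67
prefix. The function name emit_load32 and its parameters are hypothetical and
not taken from codegen_x86.cpp; only the plain [reg] addressing form with the
low eight registers is handled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not from codegen_x86.cpp: encode "mov dst32, [base]".
// In 64-bit mode the base register is 64 bits wide unless the 0x67
// address-size override prefix is present, in which case the 32-bit
// register is used and the effective address is truncated to 32 bits.
std::vector<uint8_t> emit_load32(int dst, int base, bool addr32) {
    // Low eight registers only. With mod == 00, rm == 4 needs a SIB byte
    // and rm == 5 means disp32/RIP-relative, so both are rejected here.
    assert(dst >= 0 && dst < 8);
    assert(base >= 0 && base < 8 && base != 4 && base != 5);

    std::vector<uint8_t> code;
    if (addr32)
        code.push_back(0x67);                                 // address-size override
    code.push_back(0x8B);                                     // MOV r32, r/m32
    code.push_back(static_cast<uint8_t>((dst << 3) | base));  // ModRM, mod = 00
    return code;
}
```

With dst = 0 (eax) and base = 1 (ecx), the prefixed form costs one extra byte
per memory access, which is the size overhead the log entry trades against the
complexity of the removed hack.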

Revision 1.34 - (view) (annotate) - [select for diffs]
2006-02-26T18:49:55Z (18 years, 4 months ago) by gbeauche
Branch: MAIN
CVS Tags: nigel-build-19
Changes since 1.33: +17 -7 lines
Diff to previous 1.33
fix FETOX & FTWOTOX translations for x86_64

Revision 1.33 - (view) (annotate) - [select for diffs]
2006-02-06T23:06:54Z (18 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.32: +7 -3 lines
Diff to previous 1.32
Fix SAHF_SETO_PROFITABLE code for x86-64 platforms.

This was only an experiment. Improvement was marginal: only +3% on AMD64
(an Athlon 64 3200+). However, it may be interesting to test it on EM64T
(e.g. newer P4s) since an older P3/800, hence in 32-bit mode, got a +15%
improvement in Speedometer 4 benchmarks.

Rationale: lahf/seto sequences avoid load/stores to the stack (push/pop)
and it was thus hoped to be faster.

Anyhow, SAHF_SETO_PROFITABLE can only be enabled manually at this time.
Edit your generated Makefile for testing, but first make sure your CPU
supports lahf in 64-bit mode (lahf_lm flag in /proc/cpuinfo).
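
The check suggested above can also be scripted. The following is a minimal
sketch, assuming text in the format of Linux's /proc/cpuinfo; the helper name
cpuinfo_has_flag is hypothetical and is not part of Basilisk II.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: return true if `flag` appears as a whole word on a
// "flags" line of text formatted like Linux's /proc/cpuinfo.
bool cpuinfo_has_flag(const std::string& cpuinfo, const std::string& flag) {
    assert(!flag.empty());
    std::istringstream lines(cpuinfo);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.compare(0, 5, "flags") != 0)
            continue;                        // not a "flags : ..." line
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos)
            continue;
        // Split the remainder on whitespace and compare whole words, so
        // that "lm" does not accidentally match "lahf_lm".
        std::istringstream words(line.substr(colon + 1));
        std::string word;
        while (words >> word)
            if (word == flag)
                return true;
    }
    return false;
}
```

A caller would read /proc/cpuinfo into a std::string (for example with
std::ifstream) and test for "lahf_lm" before enabling SAHF_SETO_PROFITABLE.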

Revision 1.32 - (view) (annotate) - [select for diffs]
2006-01-16T21:31:41Z (18 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.31: +3 -2 lines
Diff to previous 1.31
more precise callee-saved register set

Revision 1.31 - (view) (annotate) - [select for diffs]
2006-01-15T22:42:51Z (18 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.30: +9 -1 lines
Diff to previous 1.30
fix stack alignment (theoretically, but it was OK in practice) in generated
functions, move m68k_compile_execute() to compiler/ dir since it's JIT
generic and it now depends on USE_PUSH_POP (as it should)

Revision 1.30 - (view) (annotate) - [select for diffs]
2005-07-24T14:57:11Z (18 years, 11 months ago) by gbeauche
Branch: MAIN
CVS Tags: nigel-build-17
Changes since 1.29: +2 -2 lines
Diff to previous 1.29
Stop abort()'ing when we fail to recognize the underlying processor, assume
an obsolete i386 instead. Keep report on stderr though.

Revision 1.29 - (view) (annotate) - [select for diffs]
2005-07-24T14:48:27Z (18 years, 11 months ago) by gbeauche
Branch: MAIN
Changes since 1.28: +16 -18 lines
Diff to previous 1.28
recognize more P4 cores

Revision 1.28 - (view) (annotate) - [select for diffs]
2005-04-21T09:08:57Z (19 years, 2 months ago) by gbeauche
Branch: MAIN
Changes since 1.27: +7 -5 lines
Diff to previous 1.27
Recognize lahf_lm from Dual Core Opterons. This enables use of LAHF/SETO
instructions in long mode (64-bit). However, there seems to be another bug
in the JIT preventing it from being fully supported. m68k.h & codegen_x86.h
are easily fixed but another patch is still needed.

Revision 1.27 - (view) (annotate) - [select for diffs]
2005-03-22T16:12:18Z (19 years, 3 months ago) by gbeauche
Branch: MAIN
Changes since 1.26: +8 -1 lines
Diff to previous 1.26
Allocate executable space to detect CPU features (cpuid), i.e. don't crash
on non-executable .data sections on x86-64 with NX support enabled.
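
The idea of placing a small code stub in explicitly executable memory, rather
than in a data section, can be sketched as follows. This assumes a POSIX
system with mmap() and mprotect(); the helper name alloc_executable_copy is
hypothetical, and the actual revision may well use the project's own memory
management routines instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

// Hypothetical helper (POSIX only): copy `size` bytes of machine code into a
// freshly mapped region and make it executable. Returns nullptr on failure.
// The caller releases the mapping with munmap(ptr, size).
void* alloc_executable_copy(const void* code, std::size_t size) {
    assert(code != nullptr && size > 0);
    // Map the region writable first, so the code can be copied in.
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return nullptr;
    std::memcpy(mem, code, size);
    // Then drop write access and grant execute access.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return nullptr;
    }
    return mem;
}
```

Mapping the region writable and only then switching it to executable avoids
requesting a page that is writable and executable at the same time, which some
hardened systems refuse.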

Revision 1.26 - (view) (annotate) - [select for diffs]
2005-01-30T21:42:16Z (19 years, 5 months ago) by gbeauche
Branch: MAIN
Changes since 1.25: +2 -2 lines
Diff to previous 1.25
Happy New Year!

Revision 1.25 - (view) (annotate) - [select for diffs]
2004-11-11T07:07:55Z (19 years, 8 months ago) by gbeauche
Branch: MAIN
Changes since 1.24: +2 -2 lines
Diff to previous 1.24
fix tester for BSF flags handling

Revision 1.24 - (view) (annotate) - [select for diffs]
2004-11-08T21:10:46Z (19 years, 8 months ago) by gbeauche
Branch: MAIN
Changes since 1.23: +57 -16 lines
Diff to previous 1.23
Merge BSF simulation on P4 from Amithlon. Use 33-bit memory addressing model.

Revision 1.23 - (view) (annotate) - [select for diffs]
2004-11-02T23:28:19Z (19 years, 8 months ago) by gbeauche
Branch: MAIN
Changes since 1.22: +47 -32 lines
Diff to previous 1.22
fix JIT FPU for x86_64

Revision 1.22 - (view) (annotate) - [select for diffs]
2004-11-01T18:40:30Z (19 years, 8 months ago) by gbeauche
Branch: MAIN
Changes since 1.21: +2 -1 lines
Diff to previous 1.21
preserve r11 as the register used to resolve pointers to functions

Revision 1.21 - (view) (annotate) - [select for diffs]
2004-11-01T17:12:55Z (19 years, 8 months ago) by gbeauche
Branch: MAIN
Changes since 1.20: +28 -1 lines
Diff to previous 1.20
- refine need_to_preserve[] to get close to linux/x86_64 ABI
- optimize NOP fillers on x86-64 (based on GNU as implementation)

Revision 1.20 - (view) (annotate) - [select for diffs]
2004-11-01T16:01:51Z (19 years, 8 months ago) by gbeauche
Branch: MAIN
Changes since 1.19: +107 -42 lines
Diff to previous 1.19
revive and fix almost two-year old port to x86_64

Revision 1.19 - (view) (annotate) - [select for diffs]
2004-01-12T15:29:29Z (20 years, 6 months ago) by cebix
Branch: MAIN
CVS Tags: nigel-build-15, nigel-build-16
Changes since 1.18: +2 -2 lines
Diff to previous 1.18
Happy New Year! :)

Revision 1.18 - (view) (annotate) - [select for diffs]
2003-06-03T09:01:03Z (21 years, 1 month ago) by gbeauche
Branch: MAIN
Changes since 1.17: +2 -2 lines
Diff to previous 1.17
Call correct PUSHF/POPF macro

Revision 1.17 - (view) (annotate) - [select for diffs]
2003-03-21T19:12:44Z (21 years, 3 months ago) by gbeauche
Branch: MAIN
CVS Tags: nigel-build-12, nigel-build-13
Changes since 1.16: +5 -0 lines
Diff to previous 1.16
Remove some dead code. Start implementation of optimized calls to interpretive
fallbacks for untranslatable instruction handlers. Disabled for now since
call_m_01() is not correctly implemented yet.

Revision 1.16 - (view) (annotate) - [select for diffs]
2003-03-20T13:49:49Z (21 years, 3 months ago) by gbeauche
Branch: MAIN
Changes since 1.15: +29 -5 lines
Diff to previous 1.15
Detect x86-64

Revision 1.15 - (view) (annotate) - [select for diffs]
2003-03-19T17:05:02Z (21 years, 3 months ago) by gbeauche
Branch: MAIN
Changes since 1.14: +30 -3 lines
Diff to previous 1.14
Emulate CMOV in the new code generator for processors that don't support
this instruction
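
One common way to emulate CMOV is to branch over a plain MOV using the negated
condition. The sketch below shows that technique for 32-bit register operands;
the function name emit_cmov_l_rr and the have_cmov parameter are hypothetical,
and this is not necessarily the exact sequence emitted by this revision.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: encode "cmovcc dst, src" (32-bit registers), or an
// equivalent branch-over-move sequence when CMOV is unavailable.
// `cc` is the 4-bit x86 condition code (e.g. 4 = E/Z, 5 = NE/NZ).
std::vector<uint8_t> emit_cmov_l_rr(int cc, int dst, int src, bool have_cmov) {
    assert(cc >= 0 && cc < 16);
    assert(dst >= 0 && dst < 8 && src >= 0 && src < 8);
    // ModRM byte for register-to-register: mod = 11, reg = dst, rm = src.
    const uint8_t modrm = static_cast<uint8_t>(0xC0 | (dst << 3) | src);

    std::vector<uint8_t> code;
    if (have_cmov) {
        code.push_back(0x0F);
        code.push_back(static_cast<uint8_t>(0x40 | cc));        // CMOVcc r32, r/m32
        code.push_back(modrm);
    } else {
        // Flipping the low bit of an x86 condition code negates it, so the
        // jump is taken exactly when the move must NOT happen.
        code.push_back(static_cast<uint8_t>(0x70 | (cc ^ 1)));  // Jcc rel8
        code.push_back(0x02);                                   // skip the 2-byte MOV
        code.push_back(0x8B);                                   // MOV r32, r/m32
        code.push_back(modrm);
    }
    return code;
}
```

Neither Jcc nor MOV modifies the flags, so the fallback leaves the condition
codes exactly as the native instruction would.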

Revision 1.14 - (view) (annotate) - [select for diffs]
2003-03-19T16:32:51Z (21 years, 3 months ago) by gbeauche
Branch: MAIN
Changes since 1.13: +18 -8 lines
Diff to previous 1.13
Add missing wrappers of the new runtime-assembler primitives

Revision 1.13 - (view) (annotate) - [select for diffs]
2003-03-18T17:26:32Z (21 years, 3 months ago) by gbeauche
Branch: MAIN
Changes since 1.12: +968 -48 lines
Diff to previous 1.12
Add new backend, disabled until it's proofread and fully functional
Remove obsolete string-related instructions

Revision 1.12 - (view) (annotate) - [select for diffs]
2003-03-17T22:37:55Z (21 years, 3 months ago) by gbeauche
Branch: MAIN
Changes since 1.11: +1 -1 lines
Diff to previous 1.11
clobber "cc" for flags, not "flags". Thanks Milan for noticing it.

Revision 1.11 - (view) (annotate) - [select for diffs]
2003-03-13T20:34:34Z (21 years, 4 months ago) by gbeauche
Branch: MAIN
Changes since 1.10: +15 -0 lines
Diff to previous 1.10
Implement a generic setzflg_l() for P4, which makes it possible to re-enable
translation of ADDX/SUBX/BCLR/BTST/BSET/BCHG instructions, i.e. make
it faster. ;-)

Revision 1.10 - (view) (annotate) - [select for diffs]
2003-03-13T15:57:01Z (21 years, 4 months ago) by gbeauche
Branch: MAIN
Changes since 1.9: +26 -0 lines
Diff to previous 1.9
Workaround change in flags handling for BSF instruction on Pentium 4.
i.e. currently disable translation of ADDX/SUBX/B<CHG,CLR,SET,TST> instructions
in that case. That is to say, better (much?) slower than inaccurate. :-(

Revision 1.9 - (view) (annotate) - [select for diffs]
2002-10-13T11:14:24Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.8: +9 -0 lines
Diff to previous 1.8
Some instructions assume offsets are only 1-byte long. I don't think this
is 100% correct. Therefore, insert some asserts that would fail in that case.

Revision 1.8 - (view) (annotate) - [select for diffs]
2002-10-12T16:27:13Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.7: +62 -0 lines
Diff to previous 1.7
Add raw_emit_nop_filler() with more efficient no-op fillers stolen from
GNU binutils 2.12.90.0.15. Speed bump is marginal (less than 6%). Make it
the default anyway; it is controlled by the tune_nop_fillers constant.
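
The general idea behind such fillers is to pad with a few long instructions
that have no effect instead of a run of single-byte NOPs. The sketch below
uses multi-byte patterns of the kind GNU as emits for 32-bit x86 alignment;
the function name emit_nop_filler is hypothetical, only patterns up to four
bytes are included, and these encodings are no-ops in 32-bit mode only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch for 32-bit x86 only: pad `nbytes` bytes with the
// longest available no-op pattern instead of repeated single-byte NOPs.
std::vector<uint8_t> emit_nop_filler(std::size_t nbytes) {
    static const uint8_t nop1[] = {0x90};                    // nop
    static const uint8_t nop2[] = {0x89, 0xF6};              // movl %esi,%esi
    static const uint8_t nop3[] = {0x8D, 0x76, 0x00};        // leal 0(%esi),%esi
    static const uint8_t nop4[] = {0x8D, 0x74, 0x26, 0x00};  // leal 0(%esi,1),%esi
    static const uint8_t* const patterns[] = {nop1, nop2, nop3, nop4};
    const std::size_t max_len = 4;
    const std::size_t wanted = nbytes;

    std::vector<uint8_t> code;
    while (nbytes > 0) {
        // Take the longest pattern that still fits in the remaining space.
        const std::size_t len = nbytes < max_len ? nbytes : max_len;
        code.insert(code.end(), patterns[len - 1], patterns[len - 1] + len);
        nbytes -= len;
    }
    assert(code.size() == wanted);
    return code;
}
```

Padding seven bytes this way costs two instructions rather than seven, which
is where the small speed bump comes from when the padding is actually executed.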

Revision 1.7 - (view) (annotate) - [select for diffs]
2002-10-03T16:16:57Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.6: +2 -0 lines
Diff to previous 1.6
Don't forget to note CPU detection code mostly comes from Linux kernel.

Revision 1.6 - (view) (annotate) - [select for diffs]
2002-10-03T16:13:46Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.5: +25 -0 lines
Diff to previous 1.5
JIT: add copyright notices to make clear that this is real derivative
work from GPL code (UAE-JIT). Additions and improvements are from B2
developers.

Revision 1.5 - (view) (annotate) - [select for diffs]
2002-10-01T09:37:03Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.4: +27 -0 lines
Diff to previous 1.4
- #include "flags_x86.h" here to get NATIVE_CC_?? helper macros
- Add raw_cmp_b_mi() and raw_call_m_indexed() for generated
  m68k_compile_execute() function

Revision 1.4 - (view) (annotate) - [select for diffs]
2002-09-20T14:55:50Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.3: +1 -1 lines
Diff to previous 1.3
Fix align_jumps for athlon, that's really "16" and gcc-3.2 sources contained
the same error. ;-)

Revision 1.3 - (view) (annotate) - [select for diffs]
2002-09-19T14:59:03Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.2: +210 -77 lines
Diff to previous 1.2
- Rewrite raw_init_cpu() to match more details, from kernel sources.
- Add possibility to tune code alignment to the underlying processor. However,
  this is turned off as I don't see much improvement and align_jumps = 64
  for Athlon looks suspicious to me.
- Remove two extra align_target() that are already covered.
- Remove unused may_trap() predicate.

Revision 1.2 - (view) (annotate) - [select for diffs]
2002-09-18T15:56:17Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Changes since 1.1: +75 -15 lines
Diff to previous 1.1
Optimize the runtime assembler with shorter equivalent encodings when the
accumulator (%eax) is referenced along with immediates.
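
For instance, x86 has a dedicated one-byte opcode for ADD with %eax and a
32-bit immediate, which saves the ModRM byte of the generic form. A minimal
sketch of that selection, with a hypothetical function name emit_add_l_ri and
considering only the 32-bit immediate forms:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: encode "add reg32, imm32", using the shorter
// accumulator form when reg is %eax (register number 0).
std::vector<uint8_t> emit_add_l_ri(int reg, int32_t imm) {
    assert(reg >= 0 && reg < 8);
    std::vector<uint8_t> code;
    if (reg == 0) {
        code.push_back(0x05);                               // ADD EAX, imm32
    } else {
        code.push_back(0x81);                               // ADD r/m32, imm32
        code.push_back(static_cast<uint8_t>(0xC0 | reg));   // ModRM: mod = 11, /0
    }
    // Append the immediate in little-endian byte order.
    const uint32_t bits = static_cast<uint32_t>(imm);
    for (int i = 0; i < 4; ++i)
        code.push_back(static_cast<uint8_t>((bits >> (8 * i)) & 0xFF));
    return code;
}
```

The accumulator form is five bytes against six for the generic form. Similar
one-byte opcodes exist for the other basic ALU operations with %eax and an
immediate.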

Revision 1.1 - (view) (annotate) - [select for diffs]
2002-09-17T16:04:06Z (21 years, 9 months ago) by gbeauche
Branch: MAIN
Import JIT compiler
