
The G95 project
G95 is a stable, production Fortran 95 compiler available for multiple
cpu architectures and operating systems.
Innovations and
optimizations continue to be worked on.
Parts of the F2003 standard
have been implemented in g95.
Dan Nagle reported a problem where C_LOC returned the wrong thing when
given an argument of type C_PTR.
My old email address, the firstinter.net address, is no more.
I was
planning on letting it lapse the next time I got a bill, but I think
they decided to get rid of me instead of letting me renew.
The
address had too much spam overhead anyhow.
Mat Cross sent in a valgrind issue (reading some free'd memory) inside
the I/O library.
Doug Cox has built some new windows builds.
Jens Bischoff sent in an old IBM fortran manual that describes some of
the algorithms they used for the intrinsic functions.
Good stuff.
John Harper sent in a correction to the manual-- the -fzero option
only zeroes scalars.
Jürgen Reuter and Elizabeth Wong sent in a problem on OSX where
initialized module variables were not being handled correctly.
Turns
out different versions of OSX handle these things differently.
John Reid sent in a problem with writing output from a running coarray
program-- the output has to be unbuffered when writing to a pipe
because otherwise everything gets written at the wrong time.
A little more work on erfc() today.
Got it up to x=50 now, then had a
pause because the last couple of digits ended up being wrong.
Scratched my head for a while, then did the calculation side-by-side
between the x87 assembler and the arbitrary precision python.
The
numerator and denominator of the rational approximation were fine,
and the quotient was fine out to an ulp.
The problem turned out to be
calculating exponentials on the x87.
I'm not talking about run of the mill exponentials, I'm talking
about calculating exp(-1000).
The way this is calculated within the
x87 is to split the argument into the integer and fractional part,
calculate 2^x on the fractional part, then scale by the integer part.
For exp of -1000, the top twelve bits are the integer part.
After the
separation, the bottom twelve bits of the fraction are garbage, leaving
the bottom twelve bits of the exponential also as garbage.
So it's unfixable
without a lot of extra effort.
Michael Pock pointed out that g95 is too strict when complaining about
complex constants with spaces between the sign and the real constant.
I've got the test suite going again.
Had to update the sources in
some cases because of errors that g95 now detects, and in one case
threw out an old version of the HDF library that calls C functions in
different ways, leading to all kinds of errors.
I'm sure the current
version uses C interop features.
Evangelos Evangelou sent in a crash on array returns that has been
fixed.
This was the stare-at-it-for-an-hour, change-half-a-dozen-lines
kind of fix, and now it works.
I'm glad I have the test suite working again.
More work on erf/erfc().
I happened to run across a web page the
other day that mentioned in passing that implementing erf/erfc is
"nontrivial", which makes me feel better considering all the time I've
put into it so far.
I am up to |x| = 5.
At x > 6.75, erf(x) is
indistinguishable from one, and at about x > 110, erfc(x) < tiny(0.0_10).
The divergent series gets better for larger x, so hopefully only a few
more intervals will be required.
I am not so sure how good my
approximations are, so I am going to set up some sort of automatic
testing at some point.
The work I put into reading and writing
floating point numbers accurately is paying off.
I was thinking a little today about yesterday's post on Brent's method
for exponentials.
Part of the reason I wanted to share this is
because it is so illustrative of how numerical analysts really work.
Sure, if you're calculating an exponential like this, you can always
solve the problem by throwing more bits at it, but that isn't what
Richard Brent did.
By fiddling a little with the math, he found an
equivalent representation of the exponential that involves the
exponential power series with a value guaranteed not to have much
precision loss, some squarings that don't involve much precision
loss, and a multiplication involving shifts with no precision loss at
all.
Beautiful.
A simpler example would be if you're writing a program involving
navigation, where you'll likely end up needing (1-cos(w)) for w close
to zero, which will cause a problem because cos(w) approaches one
itself, leading to a huge loss of precision.
You too can be a Richard
Brent and remember that cosine near zero is 1 - w^2/2! + w^4/4! - ...,
making the troublesome 1-cos(w) = w^2/2! - w^4/4! + ...,
which is fast and easy to calculate for small w.
Your users never notice that
your program doesn't freak when they enter coordinates near the north
pole because of the care you've taken.
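For the curious, here is a little python sketch of the cancellation and the series workaround (the function names are mine, and this is just an illustration, not anybody's library code):

    import math

    def one_minus_cos_naive(w):
        # Direct subtraction: cos(w) is within an ulp of one for small w,
        # so nearly all of the significant bits cancel.
        return 1.0 - math.cos(w)

    def one_minus_cos_series(w):
        # First few terms of w^2/2! - w^4/4! + w^6/6! - ...; every term is
        # computed from full-precision quantities, so nothing cancels.
        w2 = w * w
        return w2 / 2.0 - w2 * w2 / 24.0 + w2 * w2 * w2 / 720.0

    w = 1.0e-7
    print(one_minus_cos_naive(w))   # only a couple of correct digits survive
    print(one_minus_cos_series(w))  # ~5.0e-15, good to full double precision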
Michael Richmond sent in another INTENT(OUT) regression that has
been fixed.
Several people have reported problems with using g95 on the Snow
Leopard version of osx.
Alison Boeckmann has volunteered to help me
with getting a build going.
John Reid pointed out some inaccuracies about coarray on the website
that have been fixed.
Doug Cox has built some new windows and debian builds.
Matthew Halfant sent in a crash on real do-loops that has been fixed.
Fixed another 158-warning about INTENT(OUT) dummies not being set.
Collapsed some duplicated code into a common subroutine.
It looks like some of the remaining issues are in fact problems with the suite
and not g95.
Got more time in on erf(x)/erfc(x).
Turns out the way to calculate
erfc(x) at large x is that divergent series I was sneering at the
other day.
The trick is to recognize that it only works at large x
and to quit while you're ahead.
So from zero to one, erfc(x) is just
1-erf(x), where erf(x) is the power series.
From one to ten, erfc(x)
is the continued fraction, and from ten on up, the divergent power
series.
The divergent series starts working at about nine, and the
continued fraction gets painful at ten.
The two agree perfectly from
nine to ten.
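A python sketch of the quit-while-you're-ahead idea for the divergent series, done in ordinary doubles just as an illustration (the tolerance is arbitrary, and this is not the library code):

    import math

    def erfc_asymptotic(x, tol=1e-18):
        # erfc(x) ~ exp(-x*x)/(x*sqrt(pi)) * (1 - 1/(2x^2) + 1*3/(2x^2)^2 - ...)
        # The series eventually diverges, so only add terms while they shrink.
        s, term, n = 1.0, 1.0, 1
        while True:
            new_term = term * -(2 * n - 1) / (2.0 * x * x)
            if abs(new_term) >= abs(term):
                break               # terms started growing: quit while ahead
            s += new_term
            term = new_term
            n += 1
            if abs(new_term) < tol:
                break               # converged well enough
        return math.exp(-x * x) / (x * math.sqrt(math.pi)) * s

    print(erfc_asymptotic(10.0), math.erfc(10.0))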
I also had to spend some time fixing the calculation of the exponential to
use Brent's method instead of the usual power series, which starts
losing bits for negative arguments less than around minus ten or so.
To calculate erfc(100), we need exp(-100*100) = exp(-10000).
That may
seem excessive, but kind=10 and kind=16 numbers go down to around
10^{-4700}.
Brent's method (one of his lesser-known ones) works like this: Let
r = 2^{-k} (x - n log 2),
where x is the argument whose exponential we want and n is an integer
chosen so that x - n log 2 falls between zero and log 2 (roughly x / log 2).
Pick an integer k, tied to the number of bits of desired precision, so that r comes out small.
This is done by computing the difference, then pulling the floating point
number apart into mantissa and exponent in order to figure out k and r.
Solving for x, we have
x = r 2^k + n log 2,
and exponentiating both sides,
exp(x) = 2^n exp(r)^{2^k}.
Since 0 < r < log 2, it is amenable to the power series.
Raising exp(r) to the 2^k-th power involves squaring k times.
Multiplying by
2^n involves shifting, which in floating point land is just adding n
to the exponent.
The beauty of this method is that none of the additions
or subtractions lose bits.
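Here is the shape of the computation in a few lines of python, done in ordinary doubles just to show the structure (the real point of the method is extended and arbitrary precision, where the series, the squarings and the final scaling would be done in the big-number arithmetic; k and the term count here are arbitrary):

    import math

    def exp_brent(x, k=8, terms=12):
        n = math.floor(x / math.log(2.0))          # x - n*log2 lands in [0, log 2)
        r = math.ldexp(x - n * math.log(2.0), -k)  # r = 2^-k (x - n log 2), tiny
        s, term = 1.0, 1.0                         # power series for exp(r)
        for i in range(1, terms):
            term *= r / i
            s += term
        for _ in range(k):                         # exp(r)^(2^k): square k times
            s *= s
        return math.ldexp(s, n)                    # times 2^n: adjust the exponent

    print(exp_brent(-20.0), math.exp(-20.0))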
Fixed a spurious warning for an INTENT(OUT) dummy parameter not being set.
Michael Richmond and Frank Muldoon reported a regression in dummy
procedures that has been fixed.
Damn... this keyboard is sooooo smooth.
I have a new keyboard for my laptop.
The old one has been getting
worse for years.
The 'd' key came unstuck a while back and I glued it
back on not very well.
I had ordered a new keyboard from Singapore,
but it turned out to be for a slightly different model.
The risers
were smaller than the old one for some reason and the existing screws
wouldn't reach.
The new one languished until yesterday when the ENTER key got stuck.
This happens every now and then, and blowing or shaking usually
removes whatever foreign matter is in the way.
But half an hour
later, no luck.
So I took a look at the new one again.
Today I went
off to the hardware store, and I bought a pair of longer screws and
nylon spacers.
The spacers are scotch-taped to the too-short risers
and the longer screws work fine.
The new keyboard is solidly in
place, and it's like typing on velvet.
I think that is about the last moving part in the laptop that has been replaced.
I realized that the glibc implementation of erf() isn't as good as I
thought.
For larger x's, you calculate erf(x) by computing
erf(x) = 1-erfc(x).
There is an asymptotic expansion of erfc(x) that
looks like:
erfc(x) \approx exp(-x^2) / (x \sqrt\pi) * (1 + PowerSeries(1/x^2))
The thing is, the series diverges.
For all x, no less.
The first
couple of terms in the polynomial help, but then you get the
divergence problem.
The glibc implementation actually uses:
erfc(x) \approx exp(-x^2 + R(1/x^2)) / (x \sqrt\pi)
Where R(x) is a rational approximation.
Putting the polynomial in the
exponential replaces a multiplication with an addition, but there
isn't a particular reason for using a rational approximation in 1/x^2.
I tried a rational approximation in x, and it worked a lot better.
I also figured out why the mysterious "#ifdef DO_NOT_USE_THIS" blocks are
there.
These comment out the Horner's rule evaluation, and the new
code has a set of linked partial Horner's rules.
I originally thought
there was some numerical reason for this, but re-enabling the code
didn't produce any significant differences.
Turns out that the code
was optimized for vector processors and the partial Horner's rules
have the potential for allowing a little parallelization to happen.
Doesn't affect me for the kind=10 reals in x86 assembler, or in the
quad precision, where even a lowly addition is many processor
instructions.
Got the continued fractions for erf() and erfc() going.
The continued
fraction is a little faster for erf(), but it turns out that having a
way to calculate erfc() is the real benefit here.
Subtracting
1-erf(x) doesn't count, because of the insane precision required.
These functions are real pains in the butt.
The problem is that the
erf(x) power series is an alternating series where the numerator of
the terms is a power of x and the denominator is a factorial.
The
factorials eventually dominate the powers, but for larger x's that can
take a while.
In the meantime, the partial sums get huge, meaning
that catastrophic cancellation is required to get the ultimate sum
back to one.
This means that in order to get 64 or 112 bits of some
value of erf(x), many more bits of precision than that are required.
Using the continued fraction to calculate erfc(x) at large x's
similarly requires a vast number of bits of precision, but not as many.
It looks like this is going to be so slow for large x that I'm going
to have to use a compiled arbitrary precision library instead of my
hacked python version.
I got the new test program running on ox.
I had to adjust ox's fan
settings, too.
There are several that you can set in the bios, and I
had it set on the slowest/quietest.
The motherboard will
automatically go to the next highest setting when the CPUs get hot,
but for some reason it only goes up one setting.
About a minute into the run, I
get the beeping that indicates overheating.
Starting from the next highest
setting, the fan automatically bumps up to the one after that, which
appears to work.
I ended up with a 12x speedup instead of the 20x that I had hoped for,
but the total elapsed time is short enough to keep testing
interactive.
The granularity is one cpu per test directory, which
worked fine once I put the largest directories first.
The other way
of doing this, a per-file granularity would be much more complicated
since some files have to be compiled to create required modules.
After a little fiddling, it looks like all of this complication would
only result in saving about twenty seconds or so, which isn't worth it.
In the ongoing project for computing erf(), it turns out the best
way to compute erf() on certain intervals is as 1-erfc().
There are a
couple of common series for computing erfc(), but none of them is
practical, as far as I can see-- they call for evaluating factorials
of half-integers.
So ok, I wrote the factorials in terms of gamma
functions, pulled out the factor of \Gamma(1/2) = \sqrt{\pi}, ended up
with a series that had coefficients in terms of Gaussian "chooses",
which led right back to half-integer factorials.
This isn't for evaluating erfc() within the library, this is for
evaluating erfc() in order to fit a rational approximation to it.
After casting around for another way, I hit on evaluating upper and
lower gamma functions (which lead straight to erf() and erfc()) via
continued fractions.
I'd sort of wondered for years how you actually
evaluate a continued fraction, and it wasn't long until I got Lentz's
algorithm working.
In this method, you pass a generating function to the Lentz evaluation
subroutine.
This subroutine calls the generating function with an
iteration number and the generating function returns a pair of numbers
that are the numerator and summand of the corresponding denominator.
The Lentz evaluator keeps asking for more terms until the fraction
converges.
Slick, but a little complicated when it comes to writing
the generating function.
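Here is a sketch of how I understand the scheme, in python rather than the library's C-- the modified Lentz form with the usual tiny-value guards, using the erfc continued fraction as the example generating function:

    import math

    def lentz(gen, tol=1e-15, tiny=1e-30, max_terms=500):
        # Evaluates b0 + a1/(b1 + a2/(b2 + ...)), where gen(j) returns the
        # pair (a_j, b_j) and gen(0) only supplies b0.
        _, b0 = gen(0)
        f = b0 if b0 != 0.0 else tiny
        C, D = f, 0.0
        for j in range(1, max_terms):
            a, b = gen(j)
            D = b + a * D
            C = b + a / C
            if D == 0.0: D = tiny
            if C == 0.0: C = tiny
            D = 1.0 / D
            delta = C * D
            f *= delta
            if abs(delta - 1.0) < tol:
                return f
        raise RuntimeError("continued fraction did not converge")

    def erfc_cf(x):
        # sqrt(pi) exp(x^2) erfc(x) = 1/(x + (1/2)/(x + 1/(x + (3/2)/(x + ...))))
        def gen(j):
            if j == 0:
                return (0.0, 0.0)
            return (1.0 if j == 1 else (j - 1) / 2.0, x)
        return math.exp(-x * x) / math.sqrt(math.pi) * lentz(gen)

    print(erfc_cf(2.0), math.erfc(2.0))

The fiddly part, as mentioned, is writing gen() so that it hands back the right numerator/denominator pair for each level.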
Several bugs that people have been reporting are regressions against
earlier versions of g95.
Regressions are always something I have to
watch out for.
For a long time now, I haven't been very good about
running the test suite.
After giving the matter some thought, I
realize that this is because it takes way too long to run.
Right now, the suite runs on the laptop that I use for g95, a 1.3 GHz
Pentium III.
The plan is to move it to ox, my quad core Xeon.
I've
seen a speedup by a factor of five for the python program I've been
using for the floating point library work, and with ox's four
stomachs, I'm hoping for a factor of twenty.
I've written about 200 lines to make the testing program multicore, in
the same way as the build script.
Now I have to debug it, and it's
always a pain to debug child processes where the standard descriptors
have been redirected.
Got Remez's algorithm working.
Turns out the 'e' in the prior post is
not an input, it is an output.
You take one more x and solve for e as
part of the linear system.
This makes a lot more sense.
You get an e
out that gives you a ballpark idea of how good the approximation is.
The method isn't perfect in the sense that if pushed too hard, the
rational approximation will get an extra oscillation on the edge far
outside the desired tolerance.
You have to check the final
approximation, but it looks like the basic algorithm will work.
I've got three intervals working for kind=10 erf().
There are a few
more left to go.
There is a neat trick buried in glibc that I've appropriated for this
work.
Viewed as an integer, real numbers have the sign bit as the
most significant bit, followed by a biased exponent, followed by the
most significant bits of the mantissa.
If you load the top word of a
real number as an integer, you can do integer compares to determine if
a number is a not-a-number, infinity, or even compare with numbers
that have 16-bit mantissas.
This works because the exponent is biased-- ie an exponent of zero
really means a value of -e_max.
This was meant to simplify
calculations in hardware, but there is nothing to stop us from doing
the same thing with software.
You can't compare for equality without
comparing all the low bits of the mantissa, but you can get a sort of
"lower bounds" comparison, which works fine for figuring out which
interval to use.
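Here's the same idea in python on an ordinary double, just as an illustration (glibc does this in C with unions and macros; the interval cutoffs below are made up):

    import struct

    def high_word(x):
        # Top 32 bits of an IEEE double: sign, 11-bit biased exponent, and
        # the 20 most significant mantissa bits.
        return struct.unpack(">I", struct.pack(">d", x)[:4])[0]

    # Because the biased exponent sits above the mantissa bits, comparing
    # high words of positive numbers orders them the same way as comparing
    # the values themselves-- a cheap "lower bound" comparison.
    HW_TWO = high_word(2.0)
    HW_SIX = high_word(6.0)

    def pick_interval(x):
        hw = high_word(x) & 0x7FFFFFFF      # clear the sign bit: look at |x|
        if hw >= 0x7FF00000:
            return "infinity or NaN"
        if hw < HW_TWO:
            return "small interval"
        if hw < HW_SIX:
            return "middle interval"
        return "large interval"

    for v in (0.5, 3.0, 100.0, float("inf")):
        print(v, pick_interval(v))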
For kind=16 reals, since comparisons like this are vastly faster than
polynomial calculations, I suspect that I will end up using a large
number of crude approximations instead of a small number of high order
polynomials.
Tony Mannucci pointed out that the debug symbols were still present in
the libf95.a builds.
These aren't a lot of use to the end user, and
take up about a megabyte, so I added a line in the build script to
strip the symbols.
Doug Cox has built some new windows builds.
Arjan van Dijk had a
problem with library paths on windows that Doug has fixed.
Out sick for the last couple of days.
Feeling better now.
I have continued to make sporadic progress on the extended and quad
precision transcendental functions.
I have Remez's algorithm almost working.
I think I'm having problems with poles, though.
A lot of transcendental functions are implemented by rational
approximations over various intervals of the functions in question.
These look something like:
p0 + p1 x + p2 x^2 + ...
R(x) = ------------------------ \approx f(x)
q0 + q1 x + q2 x^2 + ...
This can be tuned to the function being approximated-- in my library q0
is always one, and you can specify which p's and q's are zero up front.
I can get a straight polynomial approximation by not using any q's.
If you pick a set of as many x's as you have unknown p's and q's, in
the interval that you are approximating the function over, you get a
system of linear equations that you can solve for the p's and q's.
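As a concrete illustration, here is the plain interpolation version of that linear system in python with numpy-- no error-offset term yet (that comes in below), q0 pinned to one, and the degrees, sample points and target function chosen arbitrarily:

    import numpy as np

    def fit_rational(f, xs, m, n):
        # Fit R(x) = (p0 + ... + pm x^m) / (1 + q1 x + ... + qn x^n) so that
        # R(x_i) = f(x_i).  With q0 fixed at one, the conditions
        #   p0 + p1 x + ... + pm x^m - f(x) (q1 x + ... + qn x^n) = f(x)
        # are linear in the remaining p's and q's.
        xs = np.asarray(xs, dtype=float)
        fx = np.array([f(x) for x in xs])
        A = np.zeros((len(xs), m + 1 + n))
        for i, x in enumerate(xs):
            A[i, : m + 1] = [x ** j for j in range(m + 1)]
            A[i, m + 1 :] = [-fx[i] * x ** j for j in range(1, n + 1)]
        c = np.linalg.solve(A, fx)
        return c[: m + 1], c[m + 1 :]        # the p's and the q's (minus q0)

    p, q = fit_rational(np.exp, np.linspace(0.05, 0.95, 5), 2, 2)
    print(p, q)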
Where do the x's come from?
Since this is a library for lots of
people to use, we want the best approximation possible and a standard
way to go is to minimize the maximum error, the minimax solution.
By construction, the approximation is equal to f(x) at the x's that
you pick, and R(x) oscillates between too big and too small in
between.
So you pick a tolerance, e, smaller than the precision of the
target numbers and fit R(x) to f(x1)+e, f(x2)-e, f(x3)+e, ... over the
interval in question.
There is a theorem that says that you've got the minimax solution when
the values of the maximum deviations (|R(x) - f(x)|) are all the same.
If the deviation is always less than e, then you've got something
that will approximate the target function to precision e.
Remez's algorithm picks an initial set of x's, fits to get the p's and
q's and finds the x's of maximum deviation.
These points are
bracketed by the original x's, so something like a golden search
(Brent's method) can be started right away.
Once you've found where the maximum deviations are, use these for the
new set of x's.
The x's are supposed to converge
quadratically, which isn't happening for me yet.
Once I get this machinery in place, the work will be much more of a
turn-the-crank operation: find the best approximation, implement it, and move on.
Jacques Lefrere pointed out that argv[0] was being excluded from the
GET_COMMAND intrinsic.
John Harper pointed out a typo in the manual (SECNDS() is REAL, not INTEGER).
I've been stealing a little time here and there to work on
implementing the transcendental functions for kind=10 and kind=16
reals.
I've looked at the glibc sources and am confident that I can
adapt many of the methods for both kind=10 and kind=16.
One required component of doing something like this is high precision
arithmetic.
After all, suppose you have your brand new subroutine for
calculating Bessel functions for kind=10 reals.
How do you test such a thing?
Why, with higher precision than kind=10, using slow but simple
power series that you'd never use for the library functions.
I've taken the time to write an arbitrary precision floating point
library in python.
Like the one in C buried inside g95, it uses big
integers as its basis.
There is no particular reason for speed, so
it's better to write something that is easy to read and maintain.
It has the basics: addition, subtraction, multiplication, division and comparison.
I spent a bit of today writing a subroutine that takes one
of these numbers, rounds it to a 63-bit mantissa number and generates
an assembler initialization expression for the number, ie something
.byte 0xD3,0x66,0x14,0x30,0xA7,0xD6,0xEF,0x6B,0x7B,0x3F; 3.1E-40
that can easily be pasted into a larger code.
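A stripped-down python version of that conversion, ignoring the careful 63-bit rounding and all the special cases (zero, denormals, NaNs), just to show the 80-bit layout-- a 64-bit significand with an explicit integer bit, a 15-bit exponent biased by 16383, and the sign on top:

    import math

    def extended_bytes(x):
        # Pack a python float into the x87 80-bit extended format and emit
        # an assembler .byte line (simplified: normal, nonzero values only).
        sign = 1 if math.copysign(1.0, x) < 0 else 0
        f, e = math.frexp(abs(x))        # abs(x) = f * 2^e with 0.5 <= f < 1
        mant = int(f * (1 << 64))        # explicit leading one lands in bit 63
        expo = (e - 1) + 16383           # bias the exponent
        hi = (sign << 15) | expo
        raw = mant.to_bytes(8, "little") + hi.to_bytes(2, "little")
        return ".byte " + ",".join("0x%02X" % b for b in raw) + "\t; %g" % x

    print(extended_bytes(math.pi))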
After I get this going,
I'll end up creating programs for each of the transcendentals that
need to be done, for both implementation and testing.
Jürgen Reuter sent in a problem with the FLUSH statement in
IF-clauses that has been fixed.
John Harper pointed out that the intrinsic SECNDS extension was not
supported for kind=10 reals.
Added that.
I've finished implementing SMP coarrays for OSX.
Both x86 and ppc
versions are supported.
They work the same way as with linux, just
add a "--g95 images=10" when running your program to run it with ten
Took my own advice, and deleted everything, starting over.
Ran into the same problem, though.
After some investigation, the problem
turned out to be some apple hackery with the C string functions.
Disabled that.
Now things work fine.
Got everything working
shortly after that, until the "indirect jump without *" problem.
Inserted the * in the right place and now it looks like it works.
Finished the ppc/osx version.
The old version is archived, though,
just in case.
OSX always requires exceptions, special cases and
outright hacks to get it working right.
I got a good start on
x86/osx, but have run into a snag that causes mysterious compiler problems.
Not totally sure where the problem is, but I suspect that the
way to handle this one is to delete what I've done and start over.
Still on the wonderful OSX experience.
Working on getting the OSX build going again.
John Harper sent in a typo in the manual that has been fixed.
He also pointed out the need for a serious revision.
Doug Cox has built some new windows and Debian builds.
Reinhold Bader sent in a weird crash involving the array form of
THIS_IMAGE() that has been fixed.
The weird corner case was the
single image case.
Jun Chin Ang sent in a question on real loop variables, and in
preparing a reply, I discovered a crash on real loop variables,
having to do with the recent fix to loop variable types.
Fixed now.
Don't use real loops, though.
Doug Cox has built some new windows and Debian builds.
I guess I've completed the desktop upgrade-- I finally put its skin
back on, put it back into its cubbyhole and have resumed regular use.
Stefan Birner, I wrote you back, but your ISP seems to have decided
(over the weekend!) that my ISP are bad dudes.
No big message, I just
wanted to encourage your efforts.
Reinhold Bader sent in a problem with the THIS_IMAGE() intrinsic in
single image mode that has been fixed.
Doug Cox has built some new windows builds.
The problem with the
crash in the options processor looks like it has been fixed.
John Reid reported that his SMP coarray problem went away.
He also had success on ia64 and x86-64.
I'm very happy.
A hard debug session
that produces results is always very satisfying.
I spent some time porting some of the recent fixes to SMP coarrays
to the nascent windows version.
The code is similar in some ways, but
some fixes just don't apply because of the wildly different approaches
that have to be used.
Reinhold sent in a coarray crash on ia64 that has been fixed.
This
was a nasty bug to find.
Reinhold's institution is in Germany, and
the connection from here (Arizona) is about 10k/sec.
The debug line
numbers are way off, somehow, reducing the debugging to print
statements.
But after a couple dozen tries, I found it.
We're hoping
this fix will fix another bug reported by John Reid.
Larry Wetzel, George Madzsar, Xavier Leoncini and most of the gg95
newsgroup have seen a problem with the windows versions dealing with a
crash in the options processor.
Doug and I have been fiddling with
this for a few days now.
I caused it when I tried to remove a
deprecated option (-I-).
Although half of the effects of -I- are
doable with -iquote, that isn't the property of -I- that I was relying
on.
I think I've got a workaround for the problem; fingers are
crossed.
Doug seems to have fixed the problem manually, and new builds
are on the way.
Doug Cox has built some new debian builds.
We've been working on a
crash in the options handling introduced the other day when I finally
got rid of the -I- in the builds without really remembering why it was
there.
I think it was for the windows build, but the option is
deprecated in favor of -iquote, and it's time to move on.
John Harper sent in a new problem with IEEE_CLASS derived types that
has been fixed.
I got a neat new script working for the first time.
I was working on
another bug (that eluded me today).
The bug apparently only bites on
64-bit machines, so I wanted to use ox for testing.
Problem was, I
wasn't actually at home.
I'd run into this problem the other day, so
I had my wake-on-lan script ready.
The way this works, you open a
socket and send a broadcast packet that contains the MAC address of
the machine you want to wake up in a special format.
The target
machine is mostly asleep, but when the network interface reads the
packet with its own MAC address, it wakes the rest of the machine up.
The reverse command is, of course, "poweroff".
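A minimal version of such a script looks something like this (the MAC and broadcast addresses below are placeholders, not my machine's):

    import socket

    def wake_on_lan(mac, broadcast="192.168.1.255", port=9):
        # The magic packet is six 0xFF bytes followed by the target's MAC
        # address repeated sixteen times, sent as a UDP broadcast.
        mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
        packet = b"\xff" * 6 + mac_bytes * 16
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
            s.sendto(packet, (broadcast, port))

    wake_on_lan("00:11:22:33:44:55")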
John Reid and Reinhold Bader pointed out that J3 renamed the coarray
ALL STOP statement to ERROR STOP at the most recent meeting.
I had been holding
off on making this change, but J3 has since taken a vote that freezes
the whole draft more than it was.
I've gone ahead and made the
change.
The changes extend internally from the compiler to the
libraries.
The ALL STOP statement will continue to be accepted as a
synonym.
John Harper and Nicholas Peshek reported problems with the x86/linux
version, specifically a problem not finding glibc-2.7.
It turned out
that the x86 build was building against the libraries on my desktop
system.
It didn't suffice to compile against an old library; the
newer compiler generated a dependence on the newer glibc.
So I ended
up compiling an older version of gcc and building what was essentially
a cross compile environment for the older library.
The new build environment is the old environment.
Instead of copying
headers and library from the old system, I ended up copying files from
the old backups, since the old opensuse 10 is on a toasted disk.
I really hate pointless upgrades, and hate to force others to upgrade
when unnecessary.
The most notable thing about the old compiler is how fast it is.
gcc continues to bloat, and with bloat comes sloth.
No one notices this
because machines are faster.
Reinhold Bader and John Reid sent in a problem with allocatable
coarrays, which turned out to be two separate problems.
Cuneyt Sert sent in a broken link on the blog that has been fixed.
Michael Richmond caught some debug code that had been left in from
Reinhold and John's bug from the previous night.
This was a tough bug
to track, since the problem reared its head only after enough images
were enabled.
The debug code is gone.
Michael also sent in a regression in the CALL statement caused by
recent fixes.
Doug Cox has built some new windows builds.
The 32-bit debian build
is broken at the moment.
We are investigating.
John Harper sent in a subtle problem with the IEEE_ARITHMETIC module.
The IEEE modules have a series of intrinsic derived types for various
kinds of numbers, etc.
The .eq. and .ne. operators are defined for
some of these derived types, and you can now USE these operators like
user-defined operators.
Yuri Sohor sent in an illegal coarray code-- he had a derived type
that he was assigning across images.
On a regular code, this causes
allocations to be copied.
On a coarray code, this would have been
hideous to implement.
It turns out to be legal to pass around
derived types that contain pointers, but the pointers are considered
to be undefined on different images.
I've replaced the reporting of signal numbers on a crash with signal
names for the more popular signals.
Michael Richmond sent in a regression involving subroutine calls.
The problem was the recent fix to calling host-associated procedures.
Got it all fixed now, I hope.
Reinhold Bader sent in a bug with the deallocation of allocatable
coarrays that has been fixed.
This was the same bug I was working on
when my hard disk started its death-rattle.
It's actually quite a
good thing that a disk starts making noises before it fails.
I also fixed a problem with the allocation of coarrays in derived types.
Predictably, today was mostly spent working on getting g95 building on
the laptop.
I've removed the -I- directive in the build-- I can no
longer recall why it was there, and it seemed to mess up finding the
system include files.
The real struggle was getting the library to build.
I'd have thought this would be easier because the configuration
scripts are more straightforward.
But of course the versions of
autoconf have changed and subtle changes were apparently required.
I'm finally back to development.
A long struggle today to get the wireless card on the laptop going.
My laptop has an internal wireless card that has never worked, even
under windows.
Turned out that was interfering with the pcmcia card.
The windows side downloaded another set of updates, bringing me back
up to SP3.
It's a weird thing to apply the updates and see that there
are... 71 of them.
The upshot is that the laptop is ready to host g95 again.
I keep it
this way because I can work on g95 even without a connection to the
network.
If I do have a network connection, I connect to my desktop
via the vpn to access mail.
I'm also going to use the opportunity to migrate some things back to
the desktop, like my lilypond files.
The laptop has the worst
display, and is consequently the worst place to do music engraving.
I'm toying with the idea of not even installing X windows.
Been sick all weekend.
To add insult to injury, I've been back to
upgrade hell, this time with the new disk.
This time was worse,
because I use the laptop as a dual boot system with windows and linux.
Spent yesterday (in my reduced capacity) trying to install windows.
The laptop came with OEM installation disks for windows.
Because the
disks are OEM, they weren't particularly polished, and the
partitioning part didn't work.
Because I have the arch linux disk, I
could easily look at and set the partitions.
The OEM would install,
finish, reboot the system... then lock up on reboot.
I tried and tried and tried.
Half an hour to go through the whole
3-CD install and I must have done it a dozen times.
The key was the
realization that the installation software expected an NTFS partition
and wouldn't even mark it bootable after it ran.
Ergo, the OEM
software expected the partition to already be in a particular state,
more than just a bootable flag.
I ended up installing an even older
windows 2000, to where it would boot itself, then wiped it with the
OEM install.
It worked-- win2k must have installed a part of the bootstrap
loader that the OEM installer wasn't.
Of course, once windows is going, it's time to get unix going.
The reason for that order is that unix can live on other partitions and its installer
takes care of snarfing up the windows boot code and doing the dual boot
setup.
The installation went
smoother than usual, the third time being the charm.
Arch does have
this problem with downloads getting stalled and being downright passive
about retrying stalled downloads.
After the nth try, I got the basic
install done.
The nasty surprise was on the first boot.
The kernel
came up, the first couple of items in the initialization scripts ran,
the screen went blank and stayed that way.
Now this was a fine pickle.
The only way to go from here was to boot
from the CD, mount the drive, edit the startup scripts and reboot.
The CD wasn't real reliable on the laptop, making the whole process
vastly more irritating than it would normally have been.
I thought
about giving up, but the CD worked fine, if the laptop saw it.
Eventually tracked the problem to something in the udev daemon.
After
some head-scratching, I guessed that udev was forcing some buggy
module to load.
After guessing wrong with some manufacturer-specific modules, it hit
me-- if the screen is going blank, the problem must be in a screen
driver module.
But how to list modules on a system without a screen?
Over the network, of course.
Small problem, though, the base install
doesn't include things like sshd, and you can't see what you're
typing, so you can't log in remotely!
So, how to fix that.
Turned out I had a copy of the telnetd source.
It's small, about 2k lines of C.
Compiled it on my desktop
system, scp'ed it up to the g95 website.
Logged into the laptop
blindly, carefully typed a wget command, then chmod'ed it to make it
executable, then ran it.
Try telnet from the desktop and whattaya
know, it worked the first time.
Login as root, permission denied.
Gotta love these modern security measures.
The only way around
that one was a user account.
Had to create a user account on the
laptop, typing blind again.
With a working telnetd, I finally got to do an lsmod, and it turned
out that there was indeed a display driver, i915, being loaded.
I
blacklisted the module and things worked a lot better.
It was all
downhill after that, just regular installation of things that aren't in the base system.
After some success on unix, I moved back to windows, basically an XP
with sp1 and IE6.
Upgrading windows went smoother-- you just let the
updater tell you about the critical security flaws, download, install
and reboot.
Repeated that four times before it got to SP2.
Now it's dug
in and doesn't want to move on to SP3 for some reason.
While I was
staring at IE6, wondering how to upgrade it without getting pwned, I
decided to go with Google Chrome.
Very nice.
I've got skype, putty,
wireless networking and am on my way.
Not bad for an otherwise wasted day.
I've been more lucid for about
five hours now and hopefully got this cold licked.
Got all of the data off of my laptop.
Even if I'd lost it, the
difference between what was there and the most recent backup would
have been trivial to redo.
I have a new disk for the laptop, but have
been battling a cold the whole weekend.
Doug Cox has built some new windows builds.
I've added support for floor() and ceiling() for kind=16 reals.
No build today.
I was working away on g95 when I noticed the hard
drive on my laptop (where g95 lives) started making bad noises.
Checked the logs and read errors are starting to happen.
Time for a
new drive.
So I've shut down for the night to cool it off.
First thing tomorrow, I'll power it up and copy everything off of it.
Same procedure as the usual backup, except I'll skip compression to get
everything off as fast as possible.
The laptop and drive are 7-8 years old, so it's not that unexpected.
I actually have a new keyboard on the way, since several of the keys
don't work so well.
John Harper pointed out that NINT() was not implemented for kind=16
reals.
Got that done now.
Also noticed some internal rounding
problems with kind=16 reals that have been fixed.
John Reid sent in a SMP coarray regression that has been fixed.
Reinhold Bader sent in a crash on an illegal (coarray) program that
has been fixed.
Christian Speckner sent in a problem with procedure pointers in
internal procedures that has been fixed.
Doug Cox has built some new windows builds.
John Harper sent in a problem with Bessel functions being generic that
has been fixed.
John McFarland sent in another problem with abstract interfaces.
This time, the problem was that the abstract attribute wasn't being
propagated through modules.
This necessitated a change in the module
version-- the new modules produced by g95 won't work with previous versions.
Doug Cox has built some new windows builds.
John McFarland sent in another unique regression with abstract
procedure pointers that has been fixed.
My fix for John's original problem
was a little too broad, but we're getting there.
Michael Richmond helped clear up a lot of confusion in the various
coarray packages.
If you compile a coarray program with the g95 in
the g95-cocon-* packages, you get a network version, meant to run
under the coarray console, which builds the network.
The regular downloads have the SMP versions, which are accessed by
the --g95 images=x option.
Running the network version
with the --g95 images=x option will now print a helpful
warning message.
On top of all that, the SMP version was not being
compiled into the x86-linux build.
The regular builds having the
SMP version are: x86-linux, x86_64/linux, ia64/linux and
alpha/linux.
Reinhold Bader continued the SMP coarray shakedown with a problem with
allocatable coarrays in single-image mode and some addressing
weirdness on IA64.
Both fixed.
John McFarland sent in a regression with abstract procedure pointers
that has been fixed.
Michael Richmond and I were getting pretty confused about version
numbers, until we realized that the version numbers were not
propagating into the library version number.
Got that fixed and added
a new documentation line describing the new '--g95 images=' option.
Martien Hulsen sent in a problem with MOVE_ALLOC() that has been fixed.
John Harper sent in a bug with NINT(), which returned an incorrect
result when converting a kind=10 real on x86-64 to a kind=8 integer.
The problem was ABI conventions-- x86-64 returns kind=8 integers in
the rax register, not the edx:eax pair on x86.
Fixed the analogous
problems with FLOOR() and CEILING().
John McFarland sent a crash on an illegal procedure pointer
declaration that has been fixed.
John also sent a correct program
that was giving spurious errors for a procedure pointer assignment,
also fixed.
Doug Cox built some new windows builds for MinGW and Cygwin.
The builds against gcc 4.0 are being discontinued, because of compatibility
issues with windows 98.
Piet Wertelaers sent in a problem with TRANSFER() that has been fixed.
I've fixed the web page on the sourceforge site.
Normally this is
kept automatically in sync with www.g95.org, but my desktop upgrade
caused a change in the ssh key.
Sourceforge now has the new key.
John Reid sent in another bug with SMP coarrays that has been fixed.
This one had to do with passing coarrays though dummy arguments.
Johan Klassen sent in a gift-- US $80.00, for beer.
Thanks Johan!
It won't all go for beer-- Four Peaks has delicious beer bread that I
am partial to.
Pedro Lopes and Delbert Franz reported that the new Debian package
was broken.
I had to change the upload script, and some status messages
were intermingled with the archive itself.
Fixed now.
Reinhold Bader and I were discussing a bug that seems to have gone away.
Reinhold did supply several libraries that have enabled me
to build the coarray console on IA64 again.
The gateway machine at
Reinhold's institute is a 16-core IA64.
There are rules against using
it for computation, but it was 0-dark-thirty, and it was cool to run
the monte carlo pi calculation on all 16 cores.
Doug Cox has built some new Debian packages.
I've also decided that the 'g95' name is getting too
antiquated.
From now on, it will now be known as 'h96'.
John Reid sent in another pair of problems with SMP coarrays that have
been fixed.
Reinhold Bader sent in a problem with allocatable coarray components
that has been fixed.
Jimmy James pointed out (from a conversation on gg95) that the trip
counts for kind=8 do-loops were limited to the default integers.
Loops with kind=8 variables now have kind=8 trip counts.
This is not
general-- loops with kind=1 iterator variables will still have default
integer trip counts.
Doug Cox built some new Debian packages.
Michael Richmond sent in the necessary files to update the alpha build
from Debian 4.0 to 5.0.
Things went smoothly.
Since it was linux, it
was easy to add support for the SMP coarrays.
Reinhold Bader sent in a missing startfile that let me complete the
IA64 build.
John Reid (all hail the convener), sent in a problem with the SMP
coarray implementation.
Writes by separate images to standard out
can't overlap the records that they write, so there needs to be a
mutex on standard output.
The network version multiplexes the output
in a different way, leaving a lock unnecessary in that case.
John also sent another problem with allocatable coarrays that has been fixed.
Elliott Estrine pointed out that the new linux build was against
glibc-2.7, which causes a problem for people on older systems.
I've switched the linux build back to glibc-2.6.
Jürgen Reuter sent in a nasty bug involving initialization of
recursive structures with multidimensional arrays that has been fixed.
Nick Yas'ko sent in a problem with bad USE statements that has been fixed.
John Harper sent in a problem with the solaris build-- the solaris
linker complains about unaligned debugging relocations.
The workaround is compiling without debugging symbols.
John lives in New Zealand-- the new version of pine that I'm now using
lists the date on his mails (correctly) as 'tomorrow'.
Martien Hulsen, Maarten Becker and Michael Richmond reported a problem
with the x86-64 being unable to find the system startfiles like
crt1.o, etc.
This was an oversight on my part-- the x86-64 build is
compiled using my own x86-64 system, which runs a new
distribution.
The distribution is a new-style system which does not have support for
legacy 32-bit programs, so the startfiles are in /usr/lib.
I've fixed
g95 to search /usr/lib64 before /usr/lib when searching for startfiles
so that things will work on older systems.
Jacques Lefrère pointed out a nasty problem with the download
directories that has been fixed.
Firmly stuck in mail hell today.
I thought I had the mail set up, but
when I download a letter, it gets appended to the mailbox in a different
format than the other letters-- the 'new' format involves the list of
the servers that the mail has passed through, and that doesn't quite
jibe with the mbox format.
I thought I'd lost the last couple of
letters, but they just appear glued onto the last message that pine
thinks is there.
No clue on this one for now.
Update-- I found formail...
Mail is just a hideous
hideous problem.
You get to be an expert at it when you get your
system going, then forget everything you learned and have to re-learn
it when you upgrade your system...
Finished work on the new build system.
I've unfortunately deluded myself about the state of my mail setup.
I neglected my spam filter of choice.
I can send, but can't
quite get mail correctly yet.
Lots of new stuff.
New build is up.
New version of g95, now 0.93.
I've finished the 2.0 version of the build system.
It does all that
the previous one did and much more.
It's doubled in size, from a
thousand lines of python to just under two thousand.
The old one
built by forking a subprocess for each build, logging into remote
machines, building g95 and uploading the result.
This capability is
still intact, but I am now relying much more on local cross compiles.
A "cross compiler" is a compiler that runs locally on one machine, and
compiles a program meant to run on another machine, the 'target'
machine.
Cross compilers are tricky to build, and require a lot of time.
At the end, even a relatively easy build was taking about
three to four hours to complete.
I now have nearly a dozen of these,
and they build g95 just fine, except for a couple systems I don't
quite have access to.
A few systems remain as remote builds for now.
It's impressive to watch a full build taking place.
The complete
build takes about the same time as it did before, the slowdown not
coming from a particularly slow legacy machine, but from the sheer
volume of simultaneous processes running.
The load average peaks
around eight, and memory usage increases by about a hundred and fifty megs.
By modern standards, cheap.
New targets include freebsd7 for x86-64 and opensolaris for x86.
A minor sub-innovation I'm pleased with is in the creation of the
tarballs themselves.
The tar format is quite simple and instead of
putting everything into a temporary directory structure and running
'tar', the structure of the tarball is instead described by a data
structure, which is traversed to create the tarball.
Compiling up a
tarball, if you will.
This approach avoids a lot of unnecessary
copying of files all over the place, lets me create the 'owner' of the
files as 'g95' without actually having a 'g95' user on the system,
makes all the timestamps identical and creates symbolic links out of
nothingness.
The gzip process is still done with 'gzip', though.
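Not the build system's actual code, but python's tarfile module can show the idea-- file contents, the 'g95' owner, uniform timestamps and symlinks all written straight into the archive with no staging directory (the paths and contents here are invented):

    import io
    import tarfile

    STAMP = 1234567890                     # one timestamp for everything
    entries = [
        ("g95-install/bin/g95", "file", b"#!/bin/sh\necho fake g95\n"),
        ("g95-install/bin/f95", "symlink", "g95"),
    ]

    def make_tarball(name, entries):
        with tarfile.open(name, "w") as tar:        # gzip is run separately
            for path, kind, payload in entries:
                ti = tarfile.TarInfo(path)
                ti.uname = ti.gname = "g95"         # owner without a 'g95' user
                ti.mtime = STAMP
                if kind == "symlink":
                    ti.type = tarfile.SYMTYPE
                    ti.linkname = payload           # a symlink out of nothingness
                    tar.addfile(ti)
                else:
                    ti.size = len(payload)
                    ti.mode = 0o755
                    tar.addfile(ti, io.BytesIO(payload))

    make_tarball("g95-demo.tar", entries)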
The new build system also integrates another new feature-- SMP-only
coarray support.
This version won't operate over networks, but it is
free for use by anyone and is in the new version.
There is currently
a limit of 32 images hardcoded into the library, but I would be happy
to compile another version if anyone is using some monster system.
I will probably remove the limit sometime soon-- it just makes the first
version easier.
SMP coarrays are currently available on x86/linux and x86-64/linux.
Other unixes shouldn't be that difficult.
I am planning a similar
version for windows machines.
I started with a copy of the unix
library and have started work on the port, although there are so many
changes that calling it a 'rewrite' is probably more accurate.
I spent quite some time poring through and have figured out how to
do everything.
It's pretty much the same-- the coarrays are stored in
shared memory, pipes and mutexes will be used for the various
synchronization primitives.
The original process spawns all of the
images and so on.
This also required a change in how g95-compiled programs handle the
command line, since that is the easiest way to specify the number of
images you want.
As it was, the special command line argument --g95
caused the fortran runtime settings to be printed
instead of running your program.
Now, the --g95 signals the start of
arguments that you want to pass to the runtime library.
The special
argument -- indicates that all following arguments should be passed to
your program.
If a --g95 is given and no additional arguments are
given, then the runtime variable settings are printed.
For example:
./a.out --g95
./a.out --g95 images=10
./a.out --g95 images=10 -- a b
./a.out a b --g95 images=10
./a.out a --g95 images=10 -- b
./a.out -- --g95
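The splitting rule, as described above, amounts to something like this little python sketch (an illustration of the rule only, not the runtime library's code):

    def split_args(argv):
        program, runtime = [], []
        in_runtime = False
        literal = False
        for arg in argv:
            if literal:
                program.append(arg)       # after --, everything is the program's
            elif arg == "--":
                literal = True
                in_runtime = False
            elif arg == "--g95":
                in_runtime = True         # following args go to the runtime
            else:
                (runtime if in_runtime else program).append(arg)
        return program, runtime

    print(split_args(["a", "b", "--g95", "images=10"]))   # (['a', 'b'], ['images=10'])
    print(split_args(["--g95", "images=10", "--", "a"]))  # (['a'], ['images=10'])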
I was running the monte carlo calculation of pi
on ox, my quad core xeon system.
Things were going just fine for about ten minutes, until I heard that
european two-tone police siren noise and the flashing red LED that ox
does when it is getting a little too hot.
I had set the fan on the
lowest setting for noise abatement reasons, and the fan does go to the
next level when things get hot, but with all four cores going at once,
it needed the third level.
I also took some time to install the
newfangled CPU temperature monitors.
My system upgrade still isn't quite done.
The old disk is mounted on
the new disk.
My home directory had become quite flat over the years
(ie lots of files in it), and I've been working on making it more tree-like.
Download, list, install, remove, edit config files, search the web for
docs, read docs, give up, start again.
That has been my life lately.
This is why these things get put off.
The system is coming
together, though.
I went with Arch Linux, and have been pretty happy with it.
The basic installation gets you to a bare bones system where
you get to install the individual packages.
I've never been very
happy with Suse on my laptop (where g95 currently lives).
It's more of a kitchen-sink distribution.
Arch on the other hand is a
million pieces, but they come together surprisingly well.
Your desktop system should fit you like a soft leather glove, not a
steel gauntlet.
The old system was more like a dried-out leather
glove that has become too stiff to be useful.
The video system had a nasty problem that took a long time to find.
If a character had a blue foreground, there was this weird line
through the letter, always at about the eighth row from the top.
After
much head scratching, I found out that there is an 'underline line'
setting in the VGA registers.
Apparently this was getting set to eight somewhere;
after I set it to 31, the problem went away.
The wonderful console system I described in the last post doesn't
quite work yet.
Although the monitor reports being in the right mode, it
displays only about 148 columns (ie 1184 dots).
After much
head-scratching, I remembered what happened last time-- after running
in that mode for a while, the monitor eventually figures out that
there is a better way of displaying things.
I've got ssh.
I've got compilers.
I've got X.
I've got printing.
The mpage program is gone, but a2ps replaces it.
The vpn is going.
Gpm is working, after I had to re-enable cross console
cutting and pasting.
I have enlightenment (a window manager), but
haven't figured out how to configure some keyboard shortcuts that old
time irix users would recognize.
I've got firefox-- the mouse wheel
now works (yaay!), which it didn't with the ancient RedHat 6/Mozilla.
The old system has netscape 3 running on it, and works amazingly well,
rendering pages that the old mozilla would lock up or crash on.
And the new firefox renders them all.
I've got DRI going.
I've got editors-- vi, nano... and emacs after a struggle.
The mail system is in place.
Under the new system, I'll send and
receive mail from g95.org, which was another reason for the upgrade.
The old system looked like:
Outbound: Me -> pine -> qmail -> ssh tunnel -> g95.org ISP -> You
Inbound: You -> firstinter.net -> fetchmail -> inbox -> pine -> Me
I had qmail built, installed and was dreading the configuration when I
realized that I didn't need it.
I think I was using it mainly to
buffer outbound mail if the connection went down.
A couple years ago,
that went badly when the ssh tunnel went down instead, and I didn't
notice it for a week until my outbound mail finally started bouncing.
The new mail system looks like:
Outbound: Me -> alpine -> ssh tunnel -> g95.org ISP -> You
Inbound: You -> g95.org ISP -> fetchmail (via ssh tunnel) -> inbox -> alpine -> me
The reason for the ssh tunnel is that the g95.org's ISP (bluehost)
doesn't let just any schmoe connect to port 25 to relay their mail.
This has to happen locally as far as they are concerned.
The tunnel
on the inbound leg isn't strictly necessary (although IMAP sends
passwords in cleartext), but the tunnel is already there and a little
encryption never hurts.
The pine program is gone, but alpine is its new incarnation.
The system is pretty secure by default because it runs no daemons, not even inetd.
Nevertheless, I have firewall rules because I don't want ssh's port
exposed to the nasty internet-- I've got ssh also listening on a
nonstandard port as well as the usual 22 for internal requests.
There are things to do, but the end is in sight.
I need TeX, which is
now 'TeX Live!' instead of the venerable TeTex.
I'm really looking
forward to having it on the
desktop with the huge screen instead of just on the laptop with the
worst screen.
Redirecting X doesn't work well enough.
More important-- g95's new build system is taking shape.
It'll be a
major extension of the old one.
I intend to use the infrastructure
I'm building for that for other projects as well.
One of Joel Spolsky's 'Joel Tests'
is whether you can build your software with a single command or not.
That used to be the case with g95, but it's slowly drifted away from
that, and it is time to bring that cur to heel.
February 19
More upgrade hell.
The basic system boots now.
It's an Arch Linux
system-- modern but lightweight.
OK package manager too.
I've been running it on ox (my quad core xeon-- it has four stomachs) for a
while now.
One thing I really like about it is that the total boot
time is about eight seconds.
Why am I refurbishing a P3 with a third
the clock speed of the quad core xeon?
I have other plans for ox.
I spent most of the time today getting the video mode right.
It's more important than it sounds.
If you're going to be staring at a
screen for a long time, it's really important to have it be readable.
I have a really sweet 25-inch monitor that I use which can support
a very high resolution, and in X, that is what it runs at.
If you tell linux to
use the framebuffer in the console modes, you can get a 200x75 text
mode display that is very crisp, but just a little too tiny for lots of things.
The real disadvantage of the framebuffer is that scrolling
takes a visible amount of time.
On ox, I had a program that spit out about a screenfull of data per
second, and the slowdown when the output hit the bottom of the screen
was just painful to watch.
So I continue to do what I've done for a
long time-- on boot, I set the dot clock of the video card to
something well above the standard text modes without putting the card
into graphics mode.
In that mode, an 8x12 character cell gives
a 160x64 screen that is fast and readable.
There is a program called SVGATextMode that would program various SVGA
chipsets, but it is not much used any more.
The documentation
provided by video chipset vendors, along with code examples in
X-windows, is usually enough for me to write a little code that
manipulates the extended registers.
The big problem was finding a modeline that works.
And it's not just
enough to find the modeline, you also have to find an intermediate
mode that the monitor (mine, anyhow) will accept before moving up to the
final one.
But I found them.
After that, some tweaks-- turn the
default rendering to green on black, set a decent timeout interval and
force the cursor to be a solid blue block.
My favorite thing to do with this setup is when I'm editing something
and want to check someplace else briefly, I can split the screen
vertically (in emacs), go to the other place, move the cursor to the
other window and do what I need to do.
When done, I get rid of the
other display.
Emacs will let you link the two windows together as if
you were editing a single area, but I haven't found that particularly useful.
February 18
Deep in upgrade hell.
My desktop system, which I've used for more than
a decade, is a red hat six system.
You read that right.
I can't use
it for much since one small upgrade leads to half a dozen others.
The other problem it has is that the six gigabyte disk I use is always full.
The plan now is to take a new disk, install a modern system on
it, and move all the old files onto the new drive, like a hermit crab
finding a new and bigger shell.
The 'new' disk is 30G, so that ought
to hold me for a while.
February 17
Alexei Averchenko sent in a problem with -fone-error that has been
fixed.
Warnings were being promoted to errors if this flag was given.
No build yet; I'm revamping my desktop system.
February 10
I've been working down the mails.
Michael Parkinson found a bug on windows where the >> redirection is
trashing the file instead of appending.
He sent a patch that seeks to
the end, but I can't help thinking that we're seeking to the beginning
of the file somehow.
We're investigating further.
February 1
I'm changing my email.
The firstinter.net address gets way too much
spam.
Click on the link at the top left and prove your humanity to get
the new one.
January 18
Cleared up a bunch of spam + emails today.
January 17
Some anonymous person sent a $20.00 Four Peaks gift certificate.
Thanks so much-- I appreciate this more than you know.
December 21
Richard Otero sent a pound of some really high-grade Scharffen Berger
chocolates.
December 20
Only in my alternate reality is 'shortly' three months...
Patrick sent some extremely yummy chocolates, all the way from
Switzerland.
James Beard sent a gift certificate for Four Peaks, my favorite brew pub.
That'll have to wait a little more.
Thanks guys!
