[IA64] tidy up return value of ip_fast_csum

While working on implementing csum_ipv6_magic, I noticed that the
current version of ip_fast_csum can return a value with bits above
"unsigned short" set to 1.  No harm is done right now, because every
call site chops off the upper bits when it uses the return value.
However, this is still dangerous and buggy.  Here is a patch to enforce
that the function really returns an unsigned short in the native
register format.

The fix is free, as there are plenty of open slots to add one more asm
instruction.
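
For illustration only, the effect of the last instruction can be modelled
in C.  This is a rough sketch, not kernel code: andcm(), fold_old() and
fold_new() are hypothetical helper names, and andcm() assumes the usual
IA-64 semantics of "first operand AND NOT second operand".

```c
#include <stdint.h>

/* Hypothetical C model of the IA-64 andcm instruction: a & ~b. */
static uint64_t andcm(uint64_t a, uint64_t b)
{
	return a & ~b;
}

/* Old final step: andcm ret0=-1,r20 (complements all 64 bits). */
static uint64_t fold_old(uint64_t r20)
{
	return andcm(~0ULL, r20);
}

/* New final step: mov r9=0xffff ;; andcm ret0=r9,r20 (low 16 bits only). */
static uint64_t fold_new(uint64_t r20)
{
	return andcm(0xffff, r20);
}
```

Both variants agree in the low 16 bits, which is why existing callers
that truncate the result are unaffected; only the new one leaves the
upper bits of the register clear.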

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
diff --git a/arch/ia64/lib/ip_fast_csum.S b/arch/ia64/lib/ip_fast_csum.S
index 4fb132e..1f86aeb 100644
--- a/arch/ia64/lib/ip_fast_csum.S
+++ b/arch/ia64/lib/ip_fast_csum.S
@@ -68,8 +68,9 @@
 	zxt2	r20=r20
 	;;
 	add	r20=ret0,r20
+	mov	r9=0xffff
 	;;
-	andcm	ret0=-1,r20
+	andcm	ret0=r9,r20
 	.restore sp		// reset frame state
 	br.ret.sptk.many b0
 	;;