[ Retrieve as mbox ]
From: Yang Zhang <yang.z.zhang@intel.com> To: xen-devel@lists.xen.org Cc: andrew.cooper3@citrix.com, tim@xen.org, Yang Zhang <yang.z.zhang@Intel.com>, xiantao.zhang@intel.com, JBeulich@suse.com Subject: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 10 Feb 2014 14:14:00 +0800 Message-ID: <1392012840-22555-1-git-send-email-yang.z.zhang@intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
From: Yang Zhang <yang.z.zhang@Intel.com> When enabling log dirty mode, it sets all guest's memory to readonly. And in HAP enabled domain, it modifies all EPT entries to clear write bit to make sure it is readonly. This will cause problem if VT-d shares page table with EPT: the device may issue a DMA write request, then VT-d engine tells it the target memory is readonly and result in VT-d fault. Currnetly, there are two places will enable log dirty mode: migration and vram tracking. Migration with device assigned is not allowed, so it is ok. For vram, it doesn't need to set all memory to readonly. Only track the vram range is enough. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> --- xen/arch/x86/mm/hap/hap.c | 20 ++++++++++++++------ xen/arch/x86/mm/paging.c | 9 +++++---- xen/arch/x86/mm/shadow/common.c | 2 +- xen/include/asm-x86/domain.h | 2 +- xen/include/asm-x86/paging.h | 5 +++-- xen/include/asm-x86/shadow.h | 2 +- 6 files changed, 25 insertions(+), 15 deletions(-) diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c index d3f64bd..5f75636 100644 --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -82,7 +82,7 @@ int hap_track_dirty_vram(struct domain *d, if ( !paging_mode_log_dirty(d) ) { hap_logdirty_init(d); - rc = paging_log_dirty_enable(d); + rc = paging_log_dirty_enable(d, 0); if ( rc ) goto out; } @@ -167,17 +167,25 @@ out: /* HAP LOG DIRTY SUPPORT */ /************************************************/ -/* hap code to call when log_dirty is enable. return 0 if no problem found. */ -static int hap_enable_log_dirty(struct domain *d) +/* + * hap code to call when log_dirty is enable. return 0 if no problem found. + * + * NB: Domain that having device assigned should not set log_global. Because + * there is no way to track the memory updating from device. + */ +static int hap_enable_log_dirty(struct domain *d, bool_t log_global) { /* turn on PG_log_dirty bit in paging mode */ paging_lock(d); d->arch.paging.mode |= PG_log_dirty; paging_unlock(d); - /* set l1e entries of P2M table to be read-only. */ - p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty); - flush_tlb_mask(d->domain_dirty_cpumask); + if ( log_global ) + { + /* set l1e entries of P2M table to be read-only. */ + p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty); + flush_tlb_mask(d->domain_dirty_cpumask); + } return 0; } diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index 21344e5..ab5eacb 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -164,7 +164,7 @@ void paging_free_log_dirty_bitmap(struct domain *d) paging_unlock(d); } -int paging_log_dirty_enable(struct domain *d) +int paging_log_dirty_enable(struct domain *d, bool_t log_global) { int ret; @@ -172,7 +172,7 @@ int paging_log_dirty_enable(struct domain *d) return -EINVAL; domain_pause(d); - ret = d->arch.paging.log_dirty.enable_log_dirty(d); + ret = d->arch.paging.log_dirty.enable_log_dirty(d, log_global); domain_unpause(d); return ret; @@ -489,7 +489,8 @@ void paging_log_dirty_range(struct domain *d, * These function pointers must not be followed with the log-dirty lock held. */ void paging_log_dirty_init(struct domain *d, - int (*enable_log_dirty)(struct domain *d), + int (*enable_log_dirty)(struct domain *d, + bool_t log_global), int (*disable_log_dirty)(struct domain *d), void (*clean_dirty_bitmap)(struct domain *d)) { @@ -590,7 +591,7 @@ int paging_domctl(struct domain *d, xen_domctl_shadow_op_t *sc, case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY: if ( hap_enabled(d) ) hap_logdirty_init(d); - return paging_log_dirty_enable(d); + return paging_log_dirty_enable(d, 1); case XEN_DOMCTL_SHADOW_OP_OFF: if ( paging_mode_log_dirty(d) ) diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c index 0bfa595..11c6b62 100644 --- a/xen/arch/x86/mm/shadow/common.c +++ b/xen/arch/x86/mm/shadow/common.c @@ -3418,7 +3418,7 @@ shadow_write_p2m_entry(struct vcpu *v, unsigned long gfn, /* Shadow specific code which is called in paging_log_dirty_enable(). * Return 0 if no problem found. */ -int shadow_enable_log_dirty(struct domain *d) +int shadow_enable_log_dirty(struct domain *d, bool_t log_global) { int ret; diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index ea72db2..4ff89f0 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -169,7 +169,7 @@ struct log_dirty_domain { unsigned int dirty_count; /* functions which are paging mode specific */ - int (*enable_log_dirty )(struct domain *d); + int (*enable_log_dirty )(struct domain *d, bool_t log_global); int (*disable_log_dirty )(struct domain *d); void (*clean_dirty_bitmap )(struct domain *d); }; diff --git a/xen/include/asm-x86/paging.h b/xen/include/asm-x86/paging.h index cd7ee3b..8dd2a61 100644 --- a/xen/include/asm-x86/paging.h +++ b/xen/include/asm-x86/paging.h @@ -143,14 +143,15 @@ void paging_log_dirty_range(struct domain *d, uint8_t *dirty_bitmap); /* enable log dirty */ -int paging_log_dirty_enable(struct domain *d); +int paging_log_dirty_enable(struct domain *d, bool_t log_global); /* disable log dirty */ int paging_log_dirty_disable(struct domain *d); /* log dirty initialization */ void paging_log_dirty_init(struct domain *d, - int (*enable_log_dirty)(struct domain *d), + int (*enable_log_dirty)(struct domain *d, + bool_t log_global), int (*disable_log_dirty)(struct domain *d), void (*clean_dirty_bitmap)(struct domain *d)); diff --git a/xen/include/asm-x86/shadow.h b/xen/include/asm-x86/shadow.h index 852023d..348915e 100644 --- a/xen/include/asm-x86/shadow.h +++ b/xen/include/asm-x86/shadow.h @@ -82,7 +82,7 @@ void shadow_teardown(struct domain *d); void shadow_final_teardown(struct domain *d); /* shadow code to call when log dirty is enabled */ -int shadow_enable_log_dirty(struct domain *d); +int shadow_enable_log_dirty(struct domain *d, bool_t log_global); /* shadow code to call when log dirty is disabled */ int shadow_disable_log_dirty(struct domain *d); -- 1.7.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: Tim Deegan <tim@xen.org> To: Yang Zhang <yang.z.zhang@intel.com> Cc: andrew.cooper3@citrix.com, JBeulich@suse.com, xiantao.zhang@intel.com, xen-devel@lists.xen.org Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 10 Feb 2014 09:03:14 +0100 Message-ID: <20140210080314.GA758@deinos.phlegethon.org>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote: > From: Yang Zhang <yang.z.zhang@Intel.com> > > When enabling log dirty mode, it sets all guest's memory to readonly. > And in HAP enabled domain, it modifies all EPT entries to clear write bit > to make sure it is readonly. This will cause problem if VT-d shares page > table with EPT: the device may issue a DMA write request, then VT-d engine > tells it the target memory is readonly and result in VT-d fault. So that's a problem even if only the VGA framebuffer is being tracked -- DMA from a passthrough device will either cause a spurious error or fail to update the dirt bitmap. I think it would be better not to allow VT-d and EPT to share pagetables in cases where devices are passed through (i.e. all cases where VT-d is in use). Tim. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Tim Deegan <tim@xen.org> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "JBeulich@suse.com" <JBeulich@suse.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 10 Feb 2014 08:15:16 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9D72C4@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Tim Deegan wrote on 2014-02-10: > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote: >> From: Yang Zhang <yang.z.zhang@Intel.com> >> >> When enabling log dirty mode, it sets all guest's memory to readonly. >> And in HAP enabled domain, it modifies all EPT entries to clear >> write bit to make sure it is readonly. This will cause problem if >> VT-d shares page table with EPT: the device may issue a DMA write >> request, then VT-d engine tells it the target memory is readonly and >> result in VT-d > fault. > > So that's a problem even if only the VGA framebuffer is being tracked > -- DMA from a passthrough device will either cause a spurious error or > fail to update the dirt bitmap. Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens. > > I think it would be better not to allow VT-d and EPT to share > pagetables in cases where devices are passed through (i.e. all cases where VT-d is in use). Without VT-d and EPT share page, we still cannot track the memory updating from DMA. I think the point is that we cannot track the memory updating via DMA. So the user should use the log dirty mode carefully. Also, I am not sure whether the memory updating from dom0 and QEMU is tracked currently. > > Tim. Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: Andrew Cooper <andrew.cooper3@citrix.com> To: Tim Deegan <tim@xen.org> Cc: Yang Zhang <yang.z.zhang@intel.com>, xen-devel@lists.xen.org, JBeulich@suse.com, xiantao.zhang@intel.com Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 10 Feb 2014 10:42:05 +0000 Message-ID: <52F8ACFD.7090801@citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 10/02/14 08:03, Tim Deegan wrote: > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote: >> From: Yang Zhang <yang.z.zhang@Intel.com> >> >> When enabling log dirty mode, it sets all guest's memory to readonly. >> And in HAP enabled domain, it modifies all EPT entries to clear write bit >> to make sure it is readonly. This will cause problem if VT-d shares page >> table with EPT: the device may issue a DMA write request, then VT-d engine >> tells it the target memory is readonly and result in VT-d fault. > So that's a problem even if only the VGA framebuffer is being tracked > -- DMA from a passthrough device will either cause a spurious error or > fail to update the dirt bitmap. > > I think it would be better not to allow VT-d and EPT to share > pagetables in cases where devices are passed through (i.e. all cases > where VT-d is in use). > > Tim. Sadly, this would make shared VT-d/EPT completely pointless as a feature, causing extra memory overhead in Xen by having to maintain EPT and IOMMU tables separately. Any usecase which doesn't involve dirty vram tracking (e.g. headless VM with SRIOV, PVH dom0) would be adversely affected. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <George.Dunlap@eu.citrix.com> To: Yang Zhang <yang.z.zhang@intel.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jan Beulich <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 10 Feb 2014 16:13:50 +0000 Message-ID: <CAFLBxZYrQf6w3N_uz4=6eEyKL_rU6EcDoKSD90ASSUt1jOLBkw@mail.gmail.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
create ^ title it HAP dirty vram tracking breaks pass-through thanks On Mon, Feb 10, 2014 at 6:14 AM, Yang Zhang <yang.z.zhang@intel.com> wrote: > From: Yang Zhang <yang.z.zhang@Intel.com> > > When enabling log dirty mode, it sets all guest's memory to readonly. > And in HAP enabled domain, it modifies all EPT entries to clear write bit > to make sure it is readonly. This will cause problem if VT-d shares page > table with EPT: the device may issue a DMA write request, then VT-d engine > tells it the target memory is readonly and result in VT-d fault. > > Currnetly, there are two places will enable log dirty mode: migration and vram > tracking. Migration with device assigned is not allowed, so it is ok. For vram, > it doesn't need to set all memory to readonly. Only track the vram range is enough. > > Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> > --- > xen/arch/x86/mm/hap/hap.c | 20 ++++++++++++++------ > xen/arch/x86/mm/paging.c | 9 +++++---- > xen/arch/x86/mm/shadow/common.c | 2 +- > xen/include/asm-x86/domain.h | 2 +- > xen/include/asm-x86/paging.h | 5 +++-- > xen/include/asm-x86/shadow.h | 2 +- > 6 files changed, 25 insertions(+), 15 deletions(-) > > diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c > index d3f64bd..5f75636 100644 > --- a/xen/arch/x86/mm/hap/hap.c > +++ b/xen/arch/x86/mm/hap/hap.c > @@ -82,7 +82,7 @@ int hap_track_dirty_vram(struct domain *d, > if ( !paging_mode_log_dirty(d) ) > { > hap_logdirty_init(d); > - rc = paging_log_dirty_enable(d); > + rc = paging_log_dirty_enable(d, 0); > if ( rc ) > goto out; > } > @@ -167,17 +167,25 @@ out: > /* HAP LOG DIRTY SUPPORT */ > /************************************************/ > > -/* hap code to call when log_dirty is enable. return 0 if no problem found. */ > -static int hap_enable_log_dirty(struct domain *d) > +/* > + * hap code to call when log_dirty is enable. return 0 if no problem found. > + * > + * NB: Domain that having device assigned should not set log_global. Because > + * there is no way to track the memory updating from device. > + */ > +static int hap_enable_log_dirty(struct domain *d, bool_t log_global) > { > /* turn on PG_log_dirty bit in paging mode */ > paging_lock(d); > d->arch.paging.mode |= PG_log_dirty; > paging_unlock(d); > > - /* set l1e entries of P2M table to be read-only. */ > - p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty); > - flush_tlb_mask(d->domain_dirty_cpumask); > + if ( log_global ) > + { > + /* set l1e entries of P2M table to be read-only. */ > + p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty); > + flush_tlb_mask(d->domain_dirty_cpumask); > + } > return 0; > } > > diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c > index 21344e5..ab5eacb 100644 > --- a/xen/arch/x86/mm/paging.c > +++ b/xen/arch/x86/mm/paging.c > @@ -164,7 +164,7 @@ void paging_free_log_dirty_bitmap(struct domain *d) > paging_unlock(d); > } > > -int paging_log_dirty_enable(struct domain *d) > +int paging_log_dirty_enable(struct domain *d, bool_t log_global) > { > int ret; > > @@ -172,7 +172,7 @@ int paging_log_dirty_enable(struct domain *d) > return -EINVAL; > > domain_pause(d); > - ret = d->arch.paging.log_dirty.enable_log_dirty(d); > + ret = d->arch.paging.log_dirty.enable_log_dirty(d, log_global); > domain_unpause(d); > > return ret; > @@ -489,7 +489,8 @@ void paging_log_dirty_range(struct domain *d, > * These function pointers must not be followed with the log-dirty lock held. > */ > void paging_log_dirty_init(struct domain *d, > - int (*enable_log_dirty)(struct domain *d), > + int (*enable_log_dirty)(struct domain *d, > + bool_t log_global), > int (*disable_log_dirty)(struct domain *d), > void (*clean_dirty_bitmap)(struct domain *d)) > { > @@ -590,7 +591,7 @@ int paging_domctl(struct domain *d, xen_domctl_shadow_op_t *sc, > case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY: > if ( hap_enabled(d) ) > hap_logdirty_init(d); > - return paging_log_dirty_enable(d); > + return paging_log_dirty_enable(d, 1); > > case XEN_DOMCTL_SHADOW_OP_OFF: > if ( paging_mode_log_dirty(d) ) > diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c > index 0bfa595..11c6b62 100644 > --- a/xen/arch/x86/mm/shadow/common.c > +++ b/xen/arch/x86/mm/shadow/common.c > @@ -3418,7 +3418,7 @@ shadow_write_p2m_entry(struct vcpu *v, unsigned long gfn, > /* Shadow specific code which is called in paging_log_dirty_enable(). > * Return 0 if no problem found. > */ > -int shadow_enable_log_dirty(struct domain *d) > +int shadow_enable_log_dirty(struct domain *d, bool_t log_global) > { > int ret; > > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h > index ea72db2..4ff89f0 100644 > --- a/xen/include/asm-x86/domain.h > +++ b/xen/include/asm-x86/domain.h > @@ -169,7 +169,7 @@ struct log_dirty_domain { > unsigned int dirty_count; > > /* functions which are paging mode specific */ > - int (*enable_log_dirty )(struct domain *d); > + int (*enable_log_dirty )(struct domain *d, bool_t log_global); > int (*disable_log_dirty )(struct domain *d); > void (*clean_dirty_bitmap )(struct domain *d); > }; > diff --git a/xen/include/asm-x86/paging.h b/xen/include/asm-x86/paging.h > index cd7ee3b..8dd2a61 100644 > --- a/xen/include/asm-x86/paging.h > +++ b/xen/include/asm-x86/paging.h > @@ -143,14 +143,15 @@ void paging_log_dirty_range(struct domain *d, > uint8_t *dirty_bitmap); > > /* enable log dirty */ > -int paging_log_dirty_enable(struct domain *d); > +int paging_log_dirty_enable(struct domain *d, bool_t log_global); > > /* disable log dirty */ > int paging_log_dirty_disable(struct domain *d); > > /* log dirty initialization */ > void paging_log_dirty_init(struct domain *d, > - int (*enable_log_dirty)(struct domain *d), > + int (*enable_log_dirty)(struct domain *d, > + bool_t log_global), > int (*disable_log_dirty)(struct domain *d), > void (*clean_dirty_bitmap)(struct domain *d)); > > diff --git a/xen/include/asm-x86/shadow.h b/xen/include/asm-x86/shadow.h > index 852023d..348915e 100644 > --- a/xen/include/asm-x86/shadow.h > +++ b/xen/include/asm-x86/shadow.h > @@ -82,7 +82,7 @@ void shadow_teardown(struct domain *d); > void shadow_final_teardown(struct domain *d); > > /* shadow code to call when log dirty is enabled */ > -int shadow_enable_log_dirty(struct domain *d); > +int shadow_enable_log_dirty(struct domain *d, bool_t log_global); > > /* shadow code to call when log dirty is disabled */ > int shadow_disable_log_dirty(struct domain *d); > -- > 1.7.1 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Control reply; (Full Text)
From: Tim Deegan <tim@xen.org> To: "Zhang, Yang Z" <yang.z.zhang@intel.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "JBeulich@suse.com" <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 11 Feb 2014 10:02:02 +0100 Message-ID: <20140211090202.GC92054@deinos.phlegethon.org>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
At 08:15 +0000 on 10 Feb (1392016516), Zhang, Yang Z wrote: > Tim Deegan wrote on 2014-02-10: > > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote: > >> From: Yang Zhang <yang.z.zhang@Intel.com> > >> > >> When enabling log dirty mode, it sets all guest's memory to readonly. > >> And in HAP enabled domain, it modifies all EPT entries to clear > >> write bit to make sure it is readonly. This will cause problem if > >> VT-d shares page table with EPT: the device may issue a DMA write > >> request, then VT-d engine tells it the target memory is readonly and > >> result in VT-d > > fault. > > > > So that's a problem even if only the VGA framebuffer is being tracked > > -- DMA from a passthrough device will either cause a spurious error or > > fail to update the dirt bitmap. > > Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens. > I don't think that works. We can't expect arbitrary OSes to (a) know they're running on Xen and (b) know that that means they can't DMA to or from their framebuffers. > Without VT-d and EPT share page, we still cannot track the memory > updating from DMA. Yeah, but at least we don't risk crashing the _host_ by throwing DMA failures around. > I think the point is that we cannot track the > memory updating via DMA. So the user should use the log dirty mode > carefully. Also, I am not sure whether the memory updating from dom0 > and QEMU is tracked currently. Yes, dom0 and qemu updates are tracked in the log-dirty bitmaps. Tim. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <George.Dunlap@eu.citrix.com> To: Tim Deegan <tim@xen.org> Cc: "JBeulich@suse.com" <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 11 Feb 2014 10:59:38 +0000 Message-ID: <CAFLBxZaS_Sa03VxmPqDBB0wWEdjwqEa__RbtYvwuMd=QDk9ghQ@mail.gmail.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On Tue, Feb 11, 2014 at 9:02 AM, Tim Deegan <tim@xen.org> wrote: > At 08:15 +0000 on 10 Feb (1392016516), Zhang, Yang Z wrote: >> Tim Deegan wrote on 2014-02-10: >> > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote: >> >> From: Yang Zhang <yang.z.zhang@Intel.com> >> >> >> >> When enabling log dirty mode, it sets all guest's memory to readonly. >> >> And in HAP enabled domain, it modifies all EPT entries to clear >> >> write bit to make sure it is readonly. This will cause problem if >> >> VT-d shares page table with EPT: the device may issue a DMA write >> >> request, then VT-d engine tells it the target memory is readonly and >> >> result in VT-d >> > fault. >> > >> > So that's a problem even if only the VGA framebuffer is being tracked >> > -- DMA from a passthrough device will either cause a spurious error or >> > fail to update the dirt bitmap. >> >> Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens. >> > > I don't think that works. We can't expect arbitrary OSes to (a) know > they're running on Xen and (b) know that that means they can't DMA to > or from their framebuffers. > >> Without VT-d and EPT share page, we still cannot track the memory >> updating from DMA. > > Yeah, but at least we don't risk crashing the _host_ by throwing DMA > failures around. What I'm missing here is what you think a proper solution is. It seems we have: A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer being tracked, and hope the guest doesn't DMA into video ram; DMA causes IOMMU fault. (This really shouldn't crash the host under normal circumstances; if it does it's a hardware bug.) B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into video ram. DMA causes missed update to dirty bitmap, which will hopefully just cause screen corruption. C. Do buffer scanning rather than dirty vram tracking (SLOW) D. Don't allow both a virtual video card and pass-through Given that most operating systems will probably *not* DMA into video ram, and that an IOMMU fault isn't *supposed* to be able to crash the host, 'A' seems like the most reasonable option to me. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: Tim Deegan <tim@xen.org> To: George Dunlap <George.Dunlap@eu.citrix.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "JBeulich@suse.com" <JBeulich@suse.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 11 Feb 2014 12:55:53 +0100 Message-ID: <20140211115553.GB97288@deinos.phlegethon.org>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote: > On Tue, Feb 11, 2014 at 9:02 AM, Tim Deegan <tim@xen.org> wrote: > > At 08:15 +0000 on 10 Feb (1392016516), Zhang, Yang Z wrote: > >> Tim Deegan wrote on 2014-02-10: > >> > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote: > >> >> From: Yang Zhang <yang.z.zhang@Intel.com> > >> >> > >> >> When enabling log dirty mode, it sets all guest's memory to readonly. > >> >> And in HAP enabled domain, it modifies all EPT entries to clear > >> >> write bit to make sure it is readonly. This will cause problem if > >> >> VT-d shares page table with EPT: the device may issue a DMA write > >> >> request, then VT-d engine tells it the target memory is readonly and > >> >> result in VT-d > >> > fault. > >> > > >> > So that's a problem even if only the VGA framebuffer is being tracked > >> > -- DMA from a passthrough device will either cause a spurious error or > >> > fail to update the dirt bitmap. > >> > >> Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens. > >> > > > > I don't think that works. We can't expect arbitrary OSes to (a) know > > they're running on Xen and (b) know that that means they can't DMA to > > or from their framebuffers. > > > >> Without VT-d and EPT share page, we still cannot track the memory > >> updating from DMA. > > > > Yeah, but at least we don't risk crashing the _host_ by throwing DMA > > failures around. > > What I'm missing here is what you think a proper solution is. A _proper_ solution would be for the IOMMU h/w to allow restartable faults, so that we can do all the usual fault-driven virtual memory operations with DMA. :) In the meantime... > It seems we have: > > A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer > being tracked, and hope the guest doesn't DMA into video ram; DMA > causes IOMMU fault. (This really shouldn't crash the host under normal > circumstances; if it does it's a hardware bug.) Note "hope" and "shouldn't" there. :) > B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into > video ram. DMA causes missed update to dirty bitmap, which will > hopefully just cause screen corruption. Yep. At a cost of about 0.2% in space and some extra bookkeeping (for VMs that actually have devices passed through to them). The extra bookkeeping could be expensive in some cases, but basically all of those cases are already incompatible with IOMMU. > C. Do buffer scanning rather than dirty vram tracking (SLOW) > D. Don't allow both a virtual video card and pass-through E. Share EPT and IOMMU tables until someone turns on log-dirty mode and then split them out. That one > Given that most operating systems will probably *not* DMA into video > ram, and that an IOMMU fault isn't *supposed* to be able to crash the > host, 'A' seems like the most reasonable option to me. Meh, OK. I prefer 'B' but 'A' is better than nothing, I guess, and seems to have most support from other people. On that basis this patch can have my Ack. Is it intended for 4.4 as a bugfix (i.e. should I be checking it in?) Tim. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Tim Deegan" <tim@xen.org> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yang Z Zhang <yang.z.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 11 Feb 2014 12:57:55 +0000 Message-ID: <52FA2C63020000780011B201@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote: > At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote: >> What I'm missing here is what you think a proper solution is. > > A _proper_ solution would be for the IOMMU h/w to allow restartable > faults, so that we can do all the usual fault-driven virtual memory > operations with DMA. :) In the meantime... Or maintaining the A/D bits for IOMMU side accesses too. >> It seems we have: >> >> A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer >> being tracked, and hope the guest doesn't DMA into video ram; DMA >> causes IOMMU fault. (This really shouldn't crash the host under normal >> circumstances; if it does it's a hardware bug.) > > Note "hope" and "shouldn't" there. :) > >> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into >> video ram. DMA causes missed update to dirty bitmap, which will >> hopefully just cause screen corruption. > > Yep. At a cost of about 0.2% in space and some extra bookkeeping > (for VMs that actually have devices passed through to them). > The extra bookkeeping could be expensive in some cases, but basically > all of those cases are already incompatible with IOMMU. > >> C. Do buffer scanning rather than dirty vram tracking (SLOW) >> D. Don't allow both a virtual video card and pass-through > > E. Share EPT and IOMMU tables until someone turns on log-dirty mode > and then split them out. That one Wouldn't that be problematic in terms of memory being available, namely when using ballooning in Dom0? >> Given that most operating systems will probably *not* DMA into video >> ram, and that an IOMMU fault isn't *supposed* to be able to crash the >> host, 'A' seems like the most reasonable option to me. > > Meh, OK. I prefer 'B' but 'A' is better than nothing, I guess, and > seems to have most support from other people. On that basis this > patch can have my Ack. I too would consider B better than A. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Tim Deegan <tim@xen.org>, Jan Beulich <JBeulich@suse.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 11 Feb 2014 15:55:57 +0000 Message-ID: <52FA480D.9040707@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 02/11/2014 12:57 PM, Jan Beulich wrote: >>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote: >> At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote: >>> What I'm missing here is what you think a proper solution is. >> >> A _proper_ solution would be for the IOMMU h/w to allow restartable >> faults, so that we can do all the usual fault-driven virtual memory >> operations with DMA. :) In the meantime... > > Or maintaining the A/D bits for IOMMU side accesses too. > >>> It seems we have: >>> >>> A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer >>> being tracked, and hope the guest doesn't DMA into video ram; DMA >>> causes IOMMU fault. (This really shouldn't crash the host under normal >>> circumstances; if it does it's a hardware bug.) >> >> Note "hope" and "shouldn't" there. :) >> >>> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into >>> video ram. DMA causes missed update to dirty bitmap, which will >>> hopefully just cause screen corruption. >> >> Yep. At a cost of about 0.2% in space and some extra bookkeeping >> (for VMs that actually have devices passed through to them). >> The extra bookkeeping could be expensive in some cases, but basically >> all of those cases are already incompatible with IOMMU. >> >>> C. Do buffer scanning rather than dirty vram tracking (SLOW) >>> D. Don't allow both a virtual video card and pass-through >> >> E. Share EPT and IOMMU tables until someone turns on log-dirty mode >> and then split them out. That one > > Wouldn't that be problematic in terms of memory being available, > namely when using ballooning in Dom0? > >>> Given that most operating systems will probably *not* DMA into video >>> ram, and that an IOMMU fault isn't *supposed* to be able to crash the >>> host, 'A' seems like the most reasonable option to me. >> >> Meh, OK. I prefer 'B' but 'A' is better than nothing, I guess, and >> seems to have most support from other people. On that basis this >> patch can have my Ack. > > I too would consider B better than A. I think I got a bit distracted with the "A isn't really so bad" thing. Actually, if the overhead of not sharing tables isn't very high, then B isn't such a bad option. In fact, B is what I expected Yang to submit when he originally described the problem. I was going to say, from a release perspective, B is probably the safest option for now. But on the other hand, if we've been testing sharing all this time, maybe switching back over to non-sharing whole-hog has the higher risk? Anyway, both are at least probably equal risk-wise. How easy is it to implement? -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Tim Deegan <tim@xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, Jan Beulich <JBeulich@suse.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 12 Feb 2014 00:53:34 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9D98D6@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
George Dunlap wrote on 2014-02-11: > On 02/11/2014 12:57 PM, Jan Beulich wrote: >>>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote: >>> At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote: >>>> What I'm missing here is what you think a proper solution is. >>> >>> A _proper_ solution would be for the IOMMU h/w to allow restartable >>> faults, so that we can do all the usual fault-driven virtual memory >>> operations with DMA. :) In the meantime... >> >> Or maintaining the A/D bits for IOMMU side accesses too. >> >>>> It seems we have: >>>> A. Share EPT/IOMMU tables, only do log-dirty tracking on the >>>> buffer being tracked, and hope the guest doesn't DMA into video >>>> ram; DMA causes IOMMU fault. (This really shouldn't crash the host >>>> under normal circumstances; if it does it's a hardware bug.) >>> >>> Note "hope" and "shouldn't" there. :) >>> >>>> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA >>>> into video ram. DMA causes missed update to dirty bitmap, which >>>> will hopefully just cause screen corruption. >>> >>> Yep. At a cost of about 0.2% in space and some extra bookkeeping >>> (for VMs that actually have devices passed through to them). >>> The extra bookkeeping could be expensive in some cases, but >>> basically all of those cases are already incompatible with IOMMU. >>> >>>> C. Do buffer scanning rather than dirty vram tracking (SLOW) D. >>>> Don't allow both a virtual video card and pass-through >>> >>> E. Share EPT and IOMMU tables until someone turns on log-dirty mode >>> and then split them out. That one >> >> Wouldn't that be problematic in terms of memory being available, >> namely when using ballooning in Dom0? >> >>>> Given that most operating systems will probably *not* DMA into >>>> video ram, and that an IOMMU fault isn't *supposed* to be able to >>>> crash the host, 'A' seems like the most reasonable option to me. >>> >>> Meh, OK. I prefer 'B' but 'A' is better than nothing, I guess, and >>> seems to have most support from other people. On that basis this >>> patch can have my Ack. >> >> I too would consider B better than A. > > I think I got a bit distracted with the "A isn't really so bad" thing. > Actually, if the overhead of not sharing tables isn't very high, then > B isn't such a bad option. In fact, B is what I expected Yang to > submit when he originally described the problem. Actually, the first solution came to my mind is B. Then I realized that even chose B, we still cannot track the memory updating from DMA(even with A/D bit, it still a problem). Also, considering the current usage case of log dirty in Xen(only vram tracking has problem), I though A is better.: Hypervisor only need to track the vram change. If a malicious guest try to DMA to vram range, it only crashed himself (This should be reasonable). > > I was going to say, from a release perspective, B is probably the > safest option for now. But on the other hand, if we've been testing > sharing all this time, maybe switching back over to non-sharing whole-hog has the higher risk? Another problem with B is that current VT-d large paging supporting relies on the sharing EPT and VT-d page table. This means if we choose B, then we need to re-enable VT-d large page. This would be a huge performance impaction for Xen 4.4 on using VT-d solution. > > Anyway, both are at least probably equal risk-wise. How easy is it to > implement? > > -George Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Tim Deegan <tim@xen.org>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, Jan Beulich <JBeulich@suse.com> Cc: "Zhang, Xiantao" <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Thu, 13 Feb 2014 15:46:06 +0000 Message-ID: <52FCE8BE.8050105@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: > George Dunlap wrote on 2014-02-11: >> On 02/11/2014 12:57 PM, Jan Beulich wrote: >>>>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote: >>>> At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote: >>>>> What I'm missing here is what you think a proper solution is. >>>> A _proper_ solution would be for the IOMMU h/w to allow restartable >>>> faults, so that we can do all the usual fault-driven virtual memory >>>> operations with DMA. :) In the meantime... >>> Or maintaining the A/D bits for IOMMU side accesses too. >>> >>>>> It seems we have: >>>>> A. Share EPT/IOMMU tables, only do log-dirty tracking on the >>>>> buffer being tracked, and hope the guest doesn't DMA into video >>>>> ram; DMA causes IOMMU fault. (This really shouldn't crash the host >>>>> under normal circumstances; if it does it's a hardware bug.) >>>> Note "hope" and "shouldn't" there. :) >>>> >>>>> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA >>>>> into video ram. DMA causes missed update to dirty bitmap, which >>>>> will hopefully just cause screen corruption. >>>> Yep. At a cost of about 0.2% in space and some extra bookkeeping >>>> (for VMs that actually have devices passed through to them). >>>> The extra bookkeeping could be expensive in some cases, but >>>> basically all of those cases are already incompatible with IOMMU. >>>> >>>>> C. Do buffer scanning rather than dirty vram tracking (SLOW) D. >>>>> Don't allow both a virtual video card and pass-through >>>> E. Share EPT and IOMMU tables until someone turns on log-dirty mode >>>> and then split them out. That one >>> Wouldn't that be problematic in terms of memory being available, >>> namely when using ballooning in Dom0? >>> >>>>> Given that most operating systems will probably *not* DMA into >>>>> video ram, and that an IOMMU fault isn't *supposed* to be able to >>>>> crash the host, 'A' seems like the most reasonable option to me. >>>> Meh, OK. I prefer 'B' but 'A' is better than nothing, I guess, and >>>> seems to have most support from other people. On that basis this >>>> patch can have my Ack. >>> I too would consider B better than A. >> I think I got a bit distracted with the "A isn't really so bad" thing. >> Actually, if the overhead of not sharing tables isn't very high, then >> B isn't such a bad option. In fact, B is what I expected Yang to >> submit when he originally described the problem. > Actually, the first solution came to my mind is B. Then I realized that even chose B, we still cannot track the memory updating from DMA(even with A/D bit, it still a problem). Also, considering the current usage case of log dirty in Xen(only vram tracking has problem), I though A is better.: Hypervisor only need to track the vram change. If a malicious guest try to DMA to vram range, it only crashed himself (This should be reasonable). > >> I was going to say, from a release perspective, B is probably the >> safest option for now. But on the other hand, if we've been testing >> sharing all this time, maybe switching back over to non-sharing whole-hog has the higher risk? > Another problem with B is that current VT-d large paging supporting relies on the sharing EPT and VT-d page table. This means if we choose B, then we need to re-enable VT-d large page. This would be a huge performance impaction for Xen 4.4 on using VT-d solution. OK -- if that's the case, then it definitely tips the balance back to A. Unless Tim or Jan disagrees, can one of you two check it in? Don't rush your judgement; but it would be nice to have this in before RC4, which would mean checking it in today preferrably, or early tomorrow at the latest. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Tim Deegan" <tim@xen.org> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Thu, 13 Feb 2014 15:55:43 +0000 Message-ID: <52FCF90F020000780011C29A@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: >> George Dunlap wrote on 2014-02-11: >>> I think I got a bit distracted with the "A isn't really so bad" thing. >>> Actually, if the overhead of not sharing tables isn't very high, then >>> B isn't such a bad option. In fact, B is what I expected Yang to >>> submit when he originally described the problem. >> Actually, the first solution came to my mind is B. Then I realized that even > chose B, we still cannot track the memory updating from DMA(even with A/D > bit, it still a problem). Also, considering the current usage case of log > dirty in Xen(only vram tracking has problem), I though A is better.: > Hypervisor only need to track the vram change. If a malicious guest try to > DMA to vram range, it only crashed himself (This should be reasonable). >> >>> I was going to say, from a release perspective, B is probably the >>> safest option for now. But on the other hand, if we've been testing >>> sharing all this time, maybe switching back over to non-sharing whole-hog has > the higher risk? >> Another problem with B is that current VT-d large paging supporting relies on > the sharing EPT and VT-d page table. This means if we choose B, then we need > to re-enable VT-d large page. This would be a huge performance impaction for > Xen 4.4 on using VT-d solution. > > OK -- if that's the case, then it definitely tips the balance back to > A. Unless Tim or Jan disagrees, can one of you two check it in? > > Don't rush your judgement; but it would be nice to have this in before > RC4, which would mean checking it in today preferrably, or early > tomorrow at the latest. That would be Tim then, as he would have to approve of it anyway. I should also say that while I certainly understand the argumentation above, I would still want to go this route only with the promise that B is going to be worked on reasonably soon after the release, ideally with the goal of backporting the changes for 4.4.1. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: Tim Deegan <tim@xen.org> To: Jan Beulich <JBeulich@suse.com> Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Thu, 13 Feb 2014 17:20:22 +0100 Message-ID: <20140213162022.GE82703@deinos.phlegethon.org>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote: > >>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote: > > On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: > >> George Dunlap wrote on 2014-02-11: > >>> I think I got a bit distracted with the "A isn't really so bad" thing. > >>> Actually, if the overhead of not sharing tables isn't very high, then > >>> B isn't such a bad option. In fact, B is what I expected Yang to > >>> submit when he originally described the problem. > >> Actually, the first solution came to my mind is B. Then I realized that even > > chose B, we still cannot track the memory updating from DMA(even with A/D > > bit, it still a problem). Also, considering the current usage case of log > > dirty in Xen(only vram tracking has problem), I though A is better.: > > Hypervisor only need to track the vram change. If a malicious guest try to > > DMA to vram range, it only crashed himself (This should be reasonable). > >> > >>> I was going to say, from a release perspective, B is probably the > >>> safest option for now. But on the other hand, if we've been testing > >>> sharing all this time, maybe switching back over to non-sharing whole-hog has > > the higher risk? > >> Another problem with B is that current VT-d large paging supporting relies on > > the sharing EPT and VT-d page table. This means if we choose B, then we need > > to re-enable VT-d large page. This would be a huge performance impaction for > > Xen 4.4 on using VT-d solution. > > > > OK -- if that's the case, then it definitely tips the balance back to > > A. Unless Tim or Jan disagrees, can one of you two check it in? > > > > Don't rush your judgement; but it would be nice to have this in before > > RC4, which would mean checking it in today preferrably, or early > > tomorrow at the latest. > > That would be Tim then, as he would have to approve of it anyway. Done. > I should also say that while I certainly understand the argumentation > above, I would still want to go this route only with the promise that > B is going to be worked on reasonably soon after the release, ideally > with the goal of backporting the changes for 4.4.1. Agreed. Tim. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com>, Tim Deegan <tim@xen.org> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yang Z Zhang <yang.z.zhang@intel.com>, xenbugs <xen@bugs.xenproject.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Thu, 13 Feb 2014 16:25:52 +0000 Message-ID: <52FCF210.7090702@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
title 38 Implement VT-d large pages so we can avoid sharing between EPT and IOMMU owner it Yang Z Zhang <yang.z.zhang@intel.com> thanks On 02/13/2014 04:20 PM, Tim Deegan wrote: > At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote: >>>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: >>>> George Dunlap wrote on 2014-02-11: >>>>> I think I got a bit distracted with the "A isn't really so bad" thing. >>>>> Actually, if the overhead of not sharing tables isn't very high, then >>>>> B isn't such a bad option. In fact, B is what I expected Yang to >>>>> submit when he originally described the problem. >>>> Actually, the first solution came to my mind is B. Then I realized that even >>> chose B, we still cannot track the memory updating from DMA(even with A/D >>> bit, it still a problem). Also, considering the current usage case of log >>> dirty in Xen(only vram tracking has problem), I though A is better.: >>> Hypervisor only need to track the vram change. If a malicious guest try to >>> DMA to vram range, it only crashed himself (This should be reasonable). >>>>> I was going to say, from a release perspective, B is probably the >>>>> safest option for now. But on the other hand, if we've been testing >>>>> sharing all this time, maybe switching back over to non-sharing whole-hog has >>> the higher risk? >>>> Another problem with B is that current VT-d large paging supporting relies on >>> the sharing EPT and VT-d page table. This means if we choose B, then we need >>> to re-enable VT-d large page. This would be a huge performance impaction for >>> Xen 4.4 on using VT-d solution. >>> >>> OK -- if that's the case, then it definitely tips the balance back to >>> A. Unless Tim or Jan disagrees, can one of you two check it in? >>> >>> Don't rush your judgement; but it would be nice to have this in before >>> RC4, which would mean checking it in today preferrably, or early >>> tomorrow at the latest. >> That would be Tim then, as he would have to approve of it anyway. > Done. > >> I should also say that while I certainly understand the argumentation >> above, I would still want to go this route only with the promise that >> B is going to be worked on reasonably soon after the release, ideally >> with the goal of backporting the changes for 4.4.1. > Agreed. OK -- I've retitled the bug and am going to leave it open. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Control reply; (Full Text)
From: "Jan Beulich" <JBeulich@suse.com> To: "George Dunlap" <george.dunlap@eu.citrix.com>, "Yang Z Zhang" <yang.z.zhang@intel.com>, "Tim Deegan" <tim@xen.org> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 17 Feb 2014 10:18:24 +0000 Message-ID: <5301F000020000780011CCE0@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 13.02.14 at 17:20, Tim Deegan <tim@xen.org> wrote: > At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote: >> >>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> > On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: >> >> George Dunlap wrote on 2014-02-11: >> >>> I think I got a bit distracted with the "A isn't really so bad" thing. >> >>> Actually, if the overhead of not sharing tables isn't very high, then >> >>> B isn't such a bad option. In fact, B is what I expected Yang to >> >>> submit when he originally described the problem. >> >> Actually, the first solution came to my mind is B. Then I realized that > even >> > chose B, we still cannot track the memory updating from DMA(even with A/D >> > bit, it still a problem). Also, considering the current usage case of log >> > dirty in Xen(only vram tracking has problem), I though A is better.: >> > Hypervisor only need to track the vram change. If a malicious guest try to >> > DMA to vram range, it only crashed himself (This should be reasonable). >> >> >> >>> I was going to say, from a release perspective, B is probably the >> >>> safest option for now. But on the other hand, if we've been testing >> >>> sharing all this time, maybe switching back over to non-sharing whole-hog has > >> > the higher risk? >> >> Another problem with B is that current VT-d large paging supporting relies > on >> > the sharing EPT and VT-d page table. This means if we choose B, then we need > >> > to re-enable VT-d large page. This would be a huge performance impaction for >> > Xen 4.4 on using VT-d solution. >> > >> > OK -- if that's the case, then it definitely tips the balance back to >> > A. Unless Tim or Jan disagrees, can one of you two check it in? >> > >> > Don't rush your judgement; but it would be nice to have this in before >> > RC4, which would mean checking it in today preferrably, or early >> > tomorrow at the latest. >> >> That would be Tim then, as he would have to approve of it anyway. > > Done. Actually I'm afraid there are two problems with this patch: For one, is enabling "global" log dirty mode still going to work after VRAM-only mode already got enabled? I ask because the paging_mode_log_dirty() check which paging_log_dirty_enable() does first thing suggests otherwise to me (i.e. the now conditional setting of all p2m entries to p2m_ram_logdirty would seem to never get executed). IOW I would think that we're now lacking a control operation allowing the transition from dirty VRAM tracking mode to full log dirty mode. And second, I have been fighting with finding both conditions and (eventually) the root cause of a severe performance regression (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This became _much_ worse after adding in the patch here (while in fact I had hoped it might help with the originally observed degradation): X startup fails due to timing out, and booting the guest now takes about 20 minutes). I didn't find the root cause of this yet, but meanwhile I know that - the same isn't observable on SVM - there's no problem when forcing the domain to use shadow mode - there's no need for any device to actually be assigned to the guest - the regression is very likely purely graphics related (based on the observation that when running something that regularly but not heavily updates the screen with X up, the guest consumes a full CPU's worth of processing power, yet when that updating doesn't happen, CPU consumption goes down, and it goes further down when shutting down X altogether - at least as log as the patch here doesn't get involved). This I'm observing on a Westmere box (and I didn't notice it earlier because that's one of those where due to a chipset erratum the IOMMU gets turned off by default), so it's possible that this can't be seen on more modern hardware. I'll hopefully find time today to check this on the one newer (Sandy Bridge) box I have. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Tim Deegan <tim@xen.org>, Yang Z Zhang <yang.z.zhang@intel.com>, Jan Beulich <JBeulich@suse.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 17 Feb 2014 12:23:45 +0000 Message-ID: <5301FF51.1060509@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 02/17/2014 10:18 AM, Jan Beulich wrote: >>>> On 13.02.14 at 17:20, Tim Deegan <tim@xen.org> wrote: >> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote: >>>>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: >>>>> George Dunlap wrote on 2014-02-11: >>>>>> I think I got a bit distracted with the "A isn't really so bad" thing. >>>>>> Actually, if the overhead of not sharing tables isn't very high, then >>>>>> B isn't such a bad option. In fact, B is what I expected Yang to >>>>>> submit when he originally described the problem. >>>>> Actually, the first solution came to my mind is B. Then I realized that >> even >>>> chose B, we still cannot track the memory updating from DMA(even with A/D >>>> bit, it still a problem). Also, considering the current usage case of log >>>> dirty in Xen(only vram tracking has problem), I though A is better.: >>>> Hypervisor only need to track the vram change. If a malicious guest try to >>>> DMA to vram range, it only crashed himself (This should be reasonable). >>>>>> I was going to say, from a release perspective, B is probably the >>>>>> safest option for now. But on the other hand, if we've been testing >>>>>> sharing all this time, maybe switching back over to non-sharing whole-hog has >>>> the higher risk? >>>>> Another problem with B is that current VT-d large paging supporting relies >> on >>>> the sharing EPT and VT-d page table. This means if we choose B, then we need >>>> to re-enable VT-d large page. This would be a huge performance impaction for >>>> Xen 4.4 on using VT-d solution. >>>> >>>> OK -- if that's the case, then it definitely tips the balance back to >>>> A. Unless Tim or Jan disagrees, can one of you two check it in? >>>> >>>> Don't rush your judgement; but it would be nice to have this in before >>>> RC4, which would mean checking it in today preferrably, or early >>>> tomorrow at the latest. >>> That would be Tim then, as he would have to approve of it anyway. >> Done. > Actually I'm afraid there are two problems with this patch: > > For one, is enabling "global" log dirty mode still going to work > after VRAM-only mode already got enabled? I ask because the > paging_mode_log_dirty() check which paging_log_dirty_enable() > does first thing suggests otherwise to me (i.e. the now > conditional setting of all p2m entries to p2m_ram_logdirty would > seem to never get executed). IOW I would think that we're now > lacking a control operation allowing the transition from dirty VRAM > tracking mode to full log dirty mode. Hmm, yes, doing a code inspection, that would appear to be the case. This probably wouldn't be caught by osstest, because (as I understand it) we never attach to the display, so dirty vram tracking is probably never enabled. > And second, I have been fighting with finding both conditions > and (eventually) the root cause of a severe performance > regression (compared to 4.3.x) I'm observing on an EPT+IOMMU > system. This became _much_ worse after adding in the patch here > (while in fact I had hoped it might help with the originally observed > degradation): X startup fails due to timing out, and booting the > guest now takes about 20 minutes). I didn't find the root cause of > this yet, but meanwhile I know that > - the same isn't observable on SVM > - there's no problem when forcing the domain to use shadow > mode > - there's no need for any device to actually be assigned to the > guest > - the regression is very likely purely graphics related (based on > the observation that when running something that regularly but > not heavily updates the screen with X up, the guest consumes a > full CPU's worth of processing power, yet when that updating > doesn't happen, CPU consumption goes down, and it goes further > down when shutting down X altogether - at least as log as the > patch here doesn't get involved). > This I'm observing on a Westmere box (and I didn't notice it earlier > because that's one of those where due to a chipset erratum the > IOMMU gets turned off by default), so it's possible that this can't > be seen on more modern hardware. I'll hopefully find time today to > check this on the one newer (Sandy Bridge) box I have. So you're saying that the slowdown happens if you have EPT+IOMMU, but *not* if you have EPT alone (IOMMU disabled), or shadow + IOMMU? I have an issue I haven't had time to look into where windows installs are sometimes terribly slow on my Nehalem box; but it seems to be only with qemu-xen, not qemu-traditional. I haven't tried with shadow. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Tim Deegan" <tim@xen.org> Cc: Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 17 Feb 2014 12:37:44 +0000 Message-ID: <530210A8020000780011CDD5@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 17.02.14 at 13:23, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 02/17/2014 10:18 AM, Jan Beulich wrote: >> And second, I have been fighting with finding both conditions >> and (eventually) the root cause of a severe performance >> regression (compared to 4.3.x) I'm observing on an EPT+IOMMU >> system. This became _much_ worse after adding in the patch here >> (while in fact I had hoped it might help with the originally observed >> degradation): X startup fails due to timing out, and booting the >> guest now takes about 20 minutes). I didn't find the root cause of >> this yet, but meanwhile I know that >> - the same isn't observable on SVM >> - there's no problem when forcing the domain to use shadow >> mode >> - there's no need for any device to actually be assigned to the >> guest >> - the regression is very likely purely graphics related (based on >> the observation that when running something that regularly but >> not heavily updates the screen with X up, the guest consumes a >> full CPU's worth of processing power, yet when that updating >> doesn't happen, CPU consumption goes down, and it goes further >> down when shutting down X altogether - at least as log as the >> patch here doesn't get involved). >> This I'm observing on a Westmere box (and I didn't notice it earlier >> because that's one of those where due to a chipset erratum the >> IOMMU gets turned off by default), so it's possible that this can't >> be seen on more modern hardware. I'll hopefully find time today to >> check this on the one newer (Sandy Bridge) box I have. > > So you're saying that the slowdown happens if you have EPT+IOMMU, but > *not* if you have EPT alone (IOMMU disabled), or shadow + IOMMU? > > I have an issue I haven't had time to look into where windows installs > are sometimes terribly slow on my Nehalem box; but it seems to be only > with qemu-xen, not qemu-traditional. I haven't tried with shadow. Sorry I forgot to mention this - I too suspected the qemu version update to be one possible reason, but I'm seeing the same behavior with qemu-trad. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com>, Yang Z Zhang <yang.z.zhang@intel.com>, Tim Deegan <tim@xen.org> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 17 Feb 2014 14:51:53 +0000 Message-ID: <53022209.1060005@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 02/17/2014 10:18 AM, Jan Beulich wrote: >>>> On 13.02.14 at 17:20, Tim Deegan <tim@xen.org> wrote: >> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote: >>>>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: >>>>> George Dunlap wrote on 2014-02-11: >>>>>> I think I got a bit distracted with the "A isn't really so bad" thing. >>>>>> Actually, if the overhead of not sharing tables isn't very high, then >>>>>> B isn't such a bad option. In fact, B is what I expected Yang to >>>>>> submit when he originally described the problem. >>>>> Actually, the first solution came to my mind is B. Then I realized that >> even >>>> chose B, we still cannot track the memory updating from DMA(even with A/D >>>> bit, it still a problem). Also, considering the current usage case of log >>>> dirty in Xen(only vram tracking has problem), I though A is better.: >>>> Hypervisor only need to track the vram change. If a malicious guest try to >>>> DMA to vram range, it only crashed himself (This should be reasonable). >>>>>> I was going to say, from a release perspective, B is probably the >>>>>> safest option for now. But on the other hand, if we've been testing >>>>>> sharing all this time, maybe switching back over to non-sharing whole-hog has >>>> the higher risk? >>>>> Another problem with B is that current VT-d large paging supporting relies >> on >>>> the sharing EPT and VT-d page table. This means if we choose B, then we need >>>> to re-enable VT-d large page. This would be a huge performance impaction for >>>> Xen 4.4 on using VT-d solution. >>>> >>>> OK -- if that's the case, then it definitely tips the balance back to >>>> A. Unless Tim or Jan disagrees, can one of you two check it in? >>>> >>>> Don't rush your judgement; but it would be nice to have this in before >>>> RC4, which would mean checking it in today preferrably, or early >>>> tomorrow at the latest. >>> That would be Tim then, as he would have to approve of it anyway. >> Done. > Actually I'm afraid there are two problems with this patch: > > For one, is enabling "global" log dirty mode still going to work > after VRAM-only mode already got enabled? I ask because the > paging_mode_log_dirty() check which paging_log_dirty_enable() > does first thing suggests otherwise to me (i.e. the now > conditional setting of all p2m entries to p2m_ram_logdirty would > seem to never get executed). IOW I would think that we're now > lacking a control operation allowing the transition from dirty VRAM > tracking mode to full log dirty mode. Hrm, will so far playing with this I've been unable to get a localhost migrate to fail with the vncviewer attached. Which seems a bit strange... -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Xiantao Zhang" <xiantao.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Donald D Dugger <donald.d.dugger@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 17 Feb 2014 15:00:57 +0000 Message-ID: <53023239020000780011CED9@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote: > And second, I have been fighting with finding both conditions > and (eventually) the root cause of a severe performance > regression (compared to 4.3.x) I'm observing on an EPT+IOMMU > system. This became _much_ worse after adding in the patch here > (while in fact I had hoped it might help with the originally observed > degradation): X startup fails due to timing out, and booting the > guest now takes about 20 minutes). I didn't find the root cause of > this yet, but meanwhile I know that > - the same isn't observable on SVM > - there's no problem when forcing the domain to use shadow > mode > - there's no need for any device to actually be assigned to the > guest > - the regression is very likely purely graphics related (based on > the observation that when running something that regularly but > not heavily updates the screen with X up, the guest consumes a > full CPU's worth of processing power, yet when that updating > doesn't happen, CPU consumption goes down, and it goes further > down when shutting down X altogether - at least as log as the > patch here doesn't get involved). > This I'm observing on a Westmere box (and I didn't notice it earlier > because that's one of those where due to a chipset erratum the > IOMMU gets turned off by default), so it's possible that this can't > be seen on more modern hardware. I'll hopefully find time today to > check this on the one newer (Sandy Bridge) box I have. Just got done with trying this: By default, things work fine there. As soon as I use "iommu=no-snoop", things go bad (even worse than one the older box - the guest is consuming about 2.5 CPUs worth of processing power _without_ the patch here in use, so I don't even want to think about trying it there); I guessed that to be another of the potential sources of the problem since on that older box the respective hardware feature is unavailable. While I'll try to look into this further, I guess I have to defer to our VT-d specialists at Intel at this point... Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Tim Deegan" <tim@xen.org>, "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 17 Feb 2014 15:05:01 +0000 Message-ID: <5302332D020000780011CEF1@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 17.02.14 at 15:51, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 02/17/2014 10:18 AM, Jan Beulich wrote: >> Actually I'm afraid there are two problems with this patch: >> >> For one, is enabling "global" log dirty mode still going to work >> after VRAM-only mode already got enabled? I ask because the >> paging_mode_log_dirty() check which paging_log_dirty_enable() >> does first thing suggests otherwise to me (i.e. the now >> conditional setting of all p2m entries to p2m_ram_logdirty would >> seem to never get executed). IOW I would think that we're now >> lacking a control operation allowing the transition from dirty VRAM >> tracking mode to full log dirty mode. > > Hrm, will so far playing with this I've been unable to get a localhost > migrate to fail with the vncviewer attached. Which seems a bit strange... Not necessarily - it may depend on how the tools actually do this: They might temporarily disable log dirty mode altogether, just to re-enable full mode again right away. But this specific usage of the hypervisor interface wouldn't (to me) mean that other tool stacks might not be doing this differently. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Jan Beulich <JBeulich@suse.com>, George Dunlap <george.dunlap@eu.citrix.com>, Tim Deegan <tim@xen.org> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 18 Feb 2014 03:14:47 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9ED8B0@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Jan Beulich wrote on 2014-02-17: >>>> On 17.02.14 at 15:51, George Dunlap <george.dunlap@eu.citrix.com> > wrote: >> On 02/17/2014 10:18 AM, Jan Beulich wrote: >>> Actually I'm afraid there are two problems with this patch: >>> >>> For one, is enabling "global" log dirty mode still going to work >>> after VRAM-only mode already got enabled? I ask because the >>> paging_mode_log_dirty() check which paging_log_dirty_enable() does >>> first thing suggests otherwise to me (i.e. the now conditional >>> setting of all p2m entries to p2m_ram_logdirty would seem to never >>> get executed). IOW I would think that we're now lacking a control >>> operation allowing the transition from dirty VRAM tracking mode to >>> full log dirty mode. >> >> Hrm, will so far playing with this I've been unable to get a >> localhost migrate to fail with the vncviewer attached. Which seems a bit strange... > > Not necessarily - it may depend on how the tools actually do this: > They might temporarily disable log dirty mode altogether, just to > re-enable full mode again right away. But this specific usage of the > hypervisor interface wouldn't (to me) mean that other tool stacks > might not be doing this differently. You are right. Before migration, libxc will disable log dirty mode if it already enabled before and re-enable it. So when I am developing this patch, I think it is ok for migration. If there really have other tool stacks also will use this interface (Is it true?), perhaps my original patch is better which will check paging_mode_log_dirty(d) && log_global: diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index ab5eacb..368c975 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -168,7 +168,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global) { int ret; - if ( paging_mode_log_dirty(d) ) + if ( paging_mode_log_dirty(d) && !log_global ) return -EINVAL; domain_pause(d); > > Jan Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: George Dunlap <george.dunlap@eu.citrix.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, Jan Beulich <JBeulich@suse.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Dugger, Donald D" <donald.d.dugger@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 18 Feb 2014 03:25:21 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9ED8D3@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Jan Beulich wrote on 2014-02-17: >>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote: >> And second, I have been fighting with finding both conditions and >> (eventually) the root cause of a severe performance regression >> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This >> became _much_ worse after adding in the patch here (while in fact I >> had hoped it might help with the originally observed >> degradation): X startup fails due to timing out, and booting the >> guest now takes about 20 minutes). I didn't find the root cause of >> this yet, but meanwhile I know that >> - the same isn't observable on SVM >> - there's no problem when forcing the domain to use shadow >> mode - there's no need for any device to actually be assigned to the >> guest - the regression is very likely purely graphics related (based >> on the observation that when running something that regularly but not >> heavily updates the screen with X up, the guest consumes a full CPU's >> worth of processing power, yet when that updating doesn't happen, CPU >> consumption goes down, and it goes further down when shutting down X >> altogether - at least as log as the patch here doesn't get involved). >> This I'm observing on a Westmere box (and I didn't notice it earlier >> because that's one of those where due to a chipset erratum the IOMMU >> gets turned off by default), so it's possible that this can't be >> seen on more modern hardware. I'll hopefully find time today to >> check this on the one newer (Sandy Bridge) box I have. > > Just got done with trying this: By default, things work fine there. > As soon as I use "iommu=no-snoop", things go bad (even worse than one > the older box - the guest is consuming about 2.5 CPUs worth of > processing power _without_ the patch here in use, so I don't even want > to think about trying it there); I guessed that to be another of the > potential sources of the problem since on that older box the respective hardware feature is unavailable. > > While I'll try to look into this further, I guess I have to defer to > our VT-d specialists at Intel at this point... > Hi, Jan, I tried to reproduce it. But unfortunately, I cannot reproduce it in my box (sandy bridge EP)with latest Xen(include my patch). I guess my configuration or steps may wrong, here is mine: 1. add iommu=1,no-snoop in by xen cmd line: (XEN) Intel VT-d Snoop Control not enabled. (XEN) Intel VT-d Dom0 DMA Passthrough not enabled. (XEN) Intel VT-d Queued Invalidation enabled. (XEN) Intel VT-d Interrupt Remapping enabled. (XEN) Intel VT-d Shared EPT tables enabled. 2. boot a rhel6u4 guest. 3. after guest boot up, run startx inside guest. 4. a few second, the X windows shows and didn't see any error. Also the CPU utilization is about 1.7%. Any thing wrong? > Jan Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Xiantao Zhang" <xiantao.zhang@intel.com>, "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: George Dunlap <george.dunlap@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Donald D Dugger <donald.d.dugger@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 18 Feb 2014 08:30:56 +0000 Message-ID: <53032850020000780011D1FD@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 17.02.14 at 16:00, "Jan Beulich" <JBeulich@suse.com> wrote: >>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote: >> This I'm observing on a Westmere box (and I didn't notice it earlier >> because that's one of those where due to a chipset erratum the >> IOMMU gets turned off by default), so it's possible that this can't >> be seen on more modern hardware. I'll hopefully find time today to >> check this on the one newer (Sandy Bridge) box I have. > > Just got done with trying this: By default, things work fine there. > As soon as I use "iommu=no-snoop", things go bad (even worse > than one the older box - the guest is consuming about 2.5 CPUs > worth of processing power _without_ the patch here in use, so I > don't even want to think about trying it there); I guessed that to > be another of the potential sources of the problem since on that > older box the respective hardware feature is unavailable. That wasn't a fair comparison: The guest here had an SR-IOV NIC assigned. Once removing that, badness goes back to about the level I observe on the Westmere box. I'm then relatively certain that this "extra" badness can be attributed to the excessive use of wbinvd. Which of course puts under question what good assigning a device does on a system without snoop control, if overall performance gets worse rather than better. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "Xiantao Zhang" <xiantao.zhang@intel.com> Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, DonaldD Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <george.dunlap@eu.citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 18 Feb 2014 08:45:31 +0000 Message-ID: <53032BBB020000780011D214@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: > I tried to reproduce it. But unfortunately, I cannot reproduce it in my box > (sandy bridge EP)with latest Xen(include my patch). I guess my configuration > or steps may wrong, here is mine: > > 1. add iommu=1,no-snoop in by xen cmd line: > (XEN) Intel VT-d Snoop Control not enabled. > (XEN) Intel VT-d Dom0 DMA Passthrough not enabled. > (XEN) Intel VT-d Queued Invalidation enabled. > (XEN) Intel VT-d Interrupt Remapping enabled. > (XEN) Intel VT-d Shared EPT tables enabled. > > 2. boot a rhel6u4 guest. > > 3. after guest boot up, run startx inside guest. > > 4. a few second, the X windows shows and didn't see any error. Also the CPU > utilization is about 1.7%. > > Any thing wrong? Nothing I can see. The main difference might be that I'm using a SLES11 SP3 guest instead of a RHEL one. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, Tim Deegan <tim@xen.org> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 18 Feb 2014 10:26:12 +0000 Message-ID: <53033544.2000409@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 02/18/2014 03:14 AM, Zhang, Yang Z wrote: > Jan Beulich wrote on 2014-02-17: >>>>> On 17.02.14 at 15:51, George Dunlap <george.dunlap@eu.citrix.com> >> wrote: >>> On 02/17/2014 10:18 AM, Jan Beulich wrote: >>>> Actually I'm afraid there are two problems with this patch: >>>> >>>> For one, is enabling "global" log dirty mode still going to work >>>> after VRAM-only mode already got enabled? I ask because the >>>> paging_mode_log_dirty() check which paging_log_dirty_enable() does >>>> first thing suggests otherwise to me (i.e. the now conditional >>>> setting of all p2m entries to p2m_ram_logdirty would seem to never >>>> get executed). IOW I would think that we're now lacking a control >>>> operation allowing the transition from dirty VRAM tracking mode to >>>> full log dirty mode. >>> Hrm, will so far playing with this I've been unable to get a >>> localhost migrate to fail with the vncviewer attached. Which seems a bit strange... >> Not necessarily - it may depend on how the tools actually do this: >> They might temporarily disable log dirty mode altogether, just to >> re-enable full mode again right away. But this specific usage of the >> hypervisor interface wouldn't (to me) mean that other tool stacks >> might not be doing this differently. > You are right. Before migration, libxc will disable log dirty mode if it already enabled before and re-enable it. So when I am developing this patch, I think it is ok for migration. > > If there really have other tool stacks also will use this interface (Is it true?), perhaps my original patch is better which will check paging_mode_log_dirty(d) && log_global: It turns out that the reason I couldn't get a crash was because libxc was actually paying attention to the -EINVAL return value, and disabling and then re-enabling logdirty. That's what would happen before your dirty vram patch, and that's what happens after. And arguably, that's the correct behavior for any toolstack, given that the interface returns an error. This patch would actually change the interface; if we check this in, then if you enable logdirty when dirty vram tracking is enabled, you *won't* get an error, and thus *won't* disable and re-enable logdirty mode. So actually, this patch would be more disruptive. Thanks for the patch, though. :-) -George > > diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c > index ab5eacb..368c975 100644 > --- a/xen/arch/x86/mm/paging.c > +++ b/xen/arch/x86/mm/paging.c > @@ -168,7 +168,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global) > { > int ret; > > - if ( paging_mode_log_dirty(d) ) > + if ( paging_mode_log_dirty(d) && !log_global ) > return -EINVAL; > > domain_pause(d); > > >> Jan > > Best regards, > Yang > > _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Dongxiao Xu" <dongxiao.xu@intel.com>, "Xiantao Zhang" <xiantao.zhang@intel.com> Cc: DonaldD Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 18 Feb 2014 11:46:32 +0000 Message-ID: <53035628020000780011D3EE@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: > Jan Beulich wrote on 2014-02-17: >>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote: >>> And second, I have been fighting with finding both conditions and >>> (eventually) the root cause of a severe performance regression >>> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This >>> became _much_ worse after adding in the patch here (while in fact I >>> had hoped it might help with the originally observed >>> degradation): X startup fails due to timing out, and booting the >>> guest now takes about 20 minutes). I didn't find the root cause of >>> this yet, but meanwhile I know that >>> - the same isn't observable on SVM >>> - there's no problem when forcing the domain to use shadow >>> mode - there's no need for any device to actually be assigned to the >>> guest - the regression is very likely purely graphics related (based >>> on the observation that when running something that regularly but not >>> heavily updates the screen with X up, the guest consumes a full CPU's >>> worth of processing power, yet when that updating doesn't happen, CPU >>> consumption goes down, and it goes further down when shutting down X >>> altogether - at least as log as the patch here doesn't get involved). >>> This I'm observing on a Westmere box (and I didn't notice it earlier >>> because that's one of those where due to a chipset erratum the IOMMU >>> gets turned off by default), so it's possible that this can't be >>> seen on more modern hardware. I'll hopefully find time today to >>> check this on the one newer (Sandy Bridge) box I have. >> >> Just got done with trying this: By default, things work fine there. >> As soon as I use "iommu=no-snoop", things go bad (even worse than one >> the older box - the guest is consuming about 2.5 CPUs worth of >> processing power _without_ the patch here in use, so I don't even want >> to think about trying it there); I guessed that to be another of the >> potential sources of the problem since on that older box the respective > hardware feature is unavailable. >> >> While I'll try to look into this further, I guess I have to defer to >> our VT-d specialists at Intel at this point... >> > > Hi, Jan, > > I tried to reproduce it. But unfortunately, I cannot reproduce it in my box > (sandy bridge EP)with latest Xen(include my patch). I guess my configuration > or steps may wrong, here is mine: > > 1. add iommu=1,no-snoop in by xen cmd line: > (XEN) Intel VT-d Snoop Control not enabled. > (XEN) Intel VT-d Dom0 DMA Passthrough not enabled. > (XEN) Intel VT-d Queued Invalidation enabled. > (XEN) Intel VT-d Interrupt Remapping enabled. > (XEN) Intel VT-d Shared EPT tables enabled. > > 2. boot a rhel6u4 guest. > > 3. after guest boot up, run startx inside guest. > > 4. a few second, the X windows shows and didn't see any error. Also the CPU > utilization is about 1.7%. > > Any thing wrong? Nothing at all, as it turns out. The regression is due to Dongxiao's http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.html which I have in my tree as part of various things pending for 4.5. And which at the first, second, and third glance looks pretty innocent (IOW I still have to find out _why_ it is wrong). In any case - I'm very sorry for the false alarm. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Dongxiao Xu" <dongxiao.xu@intel.com>, "Yang Z Zhang" <yang.z.zhang@intel.com>, "Xiantao Zhang" <xiantao.zhang@intel.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, DonaldD Dugger <donald.d.dugger@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 18 Feb 2014 15:28:35 +0000 Message-ID: <53038A33020000780011D5FD@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
[Part 1 (text/plain, inline)]
>>> On 18.02.14 at 12:46, "Jan Beulich" <JBeulich@suse.com> wrote: > Nothing at all, as it turns out. The regression is due to Dongxiao's > > http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.html > > which I have in my tree as part of various things pending for 4.5. > And which at the first, second, and third glance looks pretty > innocent (IOW I still have to find out _why_ it is wrong). And here's a fixed version of the patch - we simply can't drop the bogus HVM_PARAM_IDENT_PT check entirely yet. In the course of fixing this I also found two other shortcomings: - EPT EMT field should be updated upon guest MTRR writes (the lack thereof is the reason for needing to retain the bogus check) - epte_get_entry_emt() either needs "order" passed, or its callers must call it more than once for big/huge pages Jan x86/hvm: refine the judgment on IDENT_PT for EMT When trying to get the EPT EMT type, the judgment on HVM_PARAM_IDENT_PT is not correct which always returns WB type if the parameter is not set. Remove the related code. Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> We can't fully drop the dependency yet, but we should certainly avoid overriding cases already properly handled. The reason for this is that the guest setting up its MTRRs happens _after_ the EPT tables got already constructed, and no code is in place to propagate this to the EPT code. Without this check we're forcing the guest to run with all of its memory uncachable until something happens to re-write every single EPT entry. But of course this has to be just a temporary solution. In the same spirit we should defer the "very early" (when the guest is still being constructed and has no vCPU yet) override to the last possible point. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/mtrr.c +++ b/xen/arch/x86/hvm/mtrr.c @@ -689,13 +689,8 @@ uint8_t epte_get_entry_emt(struct domain *ipat = 0; - if ( (current->domain != d) && - ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) ) - return MTRR_TYPE_WRBACK; - - if ( !is_pvh_vcpu(v) && - !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) - return MTRR_TYPE_WRBACK; + if ( v->domain != d ) + v = d->vcpu ? d->vcpu[0] : NULL; if ( !mfn_valid(mfn_x(mfn)) ) return MTRR_TYPE_UNCACHABLE; @@ -718,7 +713,8 @@ uint8_t epte_get_entry_emt(struct domain return MTRR_TYPE_WRBACK; } - gmtrr_mtype = is_hvm_vcpu(v) ? + gmtrr_mtype = is_hvm_domain(d) && v && + d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ? get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn << PAGE_SHIFT)) : MTRR_TYPE_WRBACK;
[EPT-ignore-IDENT_PT (application/octet-stream, attachment)]
[Part 3 (text/plain, inline)]
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Jan Beulich <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Xu, Dongxiao" <dongxiao.xu@intel.com>, George Dunlap <george.dunlap@eu.citrix.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 01:17:08 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9EEB9B@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Jan Beulich wrote on 2014-02-18: >>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: >> Jan Beulich wrote on 2014-02-17: >>>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote: >>>> And second, I have been fighting with finding both conditions and >>>> (eventually) the root cause of a severe performance regression >>>> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This >>>> became _much_ worse after adding in the patch here (while in fact >>>> I had hoped it might help with the originally observed >>>> degradation): X startup fails due to timing out, and booting the >>>> guest now takes about 20 minutes). I didn't find the root cause of >>>> this yet, but meanwhile I know that >>>> - the same isn't observable on SVM >>>> - there's no problem when forcing the domain to use shadow >>>> mode - there's no need for any device to actually be assigned to the >>>> guest - the regression is very likely purely graphics related (based >>>> on the observation that when running something that regularly but not >>>> heavily updates the screen with X up, the guest consumes a full CPU's >>>> worth of processing power, yet when that updating doesn't >>>> happen, > CPU >>>> consumption goes down, and it goes further down when shutting >>>> down > X >>>> altogether - at least as log as the patch here doesn't get involved). >>>> This I'm observing on a Westmere box (and I didn't notice it >>>> earlier because that's one of those where due to a chipset erratum >>>> the IOMMU gets turned off by default), so it's possible that this >>>> can't be seen on more modern hardware. I'll hopefully find time >>>> today to check this on the one newer (Sandy Bridge) box I have. >>> >>> Just got done with trying this: By default, things work fine there. As >>> soon as I use "iommu=no-snoop", things go bad (even worse than one the >>> older box - the guest is consuming about 2.5 CPUs worth of processing >>> power _without_ the patch here in use, so I don't even want to think >>> about trying it there); I guessed that to be another of the potential >>> sources of the problem since on that older box the respective hardware >>> feature is unavailable. >>> >>> While I'll try to look into this further, I guess I have to defer >>> to our VT-d specialists at Intel at this point... >>> >> >> Hi, Jan, >> >> I tried to reproduce it. But unfortunately, I cannot reproduce it in >> my box (sandy bridge EP)with latest Xen(include my patch). I guess >> my configuration or steps may wrong, here is mine: >> >> 1. add iommu=1,no-snoop in by xen cmd line: >> (XEN) Intel VT-d Snoop Control not enabled. >> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled. >> (XEN) Intel VT-d Queued Invalidation enabled. >> (XEN) Intel VT-d Interrupt Remapping enabled. >> (XEN) Intel VT-d Shared EPT tables enabled. >> >> 2. boot a rhel6u4 guest. >> >> 3. after guest boot up, run startx inside guest. >> >> 4. a few second, the X windows shows and didn't see any error. Also >> the CPU utilization is about 1.7%. >> >> Any thing wrong? > > Nothing at all, as it turns out. The regression is due to Dongxiao's > > http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.h > tml > > which I have in my tree as part of various things pending for 4.5. > And which at the first, second, and third glance looks pretty innocent > (IOW I still have to find out _why_ it is wrong). > > In any case - I'm very sorry for the false alarm. > It doesn't matter. Conversely, we need to thank you for helping us to fix this issue. :) BTW, I still cannot reproduce it in my box, even I uses SLES 11 SP3 as guest. > Jan Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Tim Deegan <tim@xen.org>, Jan Beulich <JBeulich@suse.com>, George Dunlap <george.dunlap@eu.citrix.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 01:28:59 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9EEBC9@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
George Dunlap wrote on 2014-02-18: > On 02/18/2014 03:14 AM, Zhang, Yang Z wrote: >> Jan Beulich wrote on 2014-02-17: >>>>>> On 17.02.14 at 15:51, George Dunlap >>>>>> <george.dunlap@eu.citrix.com> >>> wrote: >>>> On 02/17/2014 10:18 AM, Jan Beulich wrote: >>>>> Actually I'm afraid there are two problems with this patch: >>>>> >>>>> For one, is enabling "global" log dirty mode still going to work >>>>> after VRAM-only mode already got enabled? I ask because the >>>>> paging_mode_log_dirty() check which paging_log_dirty_enable() >>>>> does first thing suggests otherwise to me (i.e. the now >>>>> conditional setting of all p2m entries to p2m_ram_logdirty would >>>>> seem to never get executed). IOW I would think that we're now >>>>> lacking a control operation allowing the transition from dirty >>>>> VRAM tracking mode to full log dirty mode. >>>> Hrm, will so far playing with this I've been unable to get a >>>> localhost migrate to fail with the vncviewer attached. Which >>>> seems a bit > strange... >>> Not necessarily - it may depend on how the tools actually do this: >>> They might temporarily disable log dirty mode altogether, just to >>> re-enable full mode again right away. But this specific usage of >>> the hypervisor interface wouldn't (to me) mean that other tool >>> stacks might not be doing this differently. >> You are right. Before migration, libxc will disable log dirty mode >> if it already > enabled before and re-enable it. So when I am developing this patch, I > think it is ok for migration. >> >> If there really have other tool stacks also will use this interface >> (Is it true?), > perhaps my original patch is better which will check > paging_mode_log_dirty(d) && log_global: > > It turns out that the reason I couldn't get a crash was because libxc > was actually paying attention to the -EINVAL return value, and > disabling and then re-enabling logdirty. That's what would happen > before your dirty vram patch, and that's what happens after. And > arguably, that's the correct behavior for any toolstack, given that the interface returns an error. Agree. > > This patch would actually change the interface; if we check this in, > then if you enable logdirty when dirty vram tracking is enabled, you > *won't* get an error, and thus *won't* disable and re-enable logdirty mode. > So actually, this patch would be more disruptive. > Jan, do you have any comment? > Thanks for the patch, though. :-) > > -George >> >> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c >> index >> ab5eacb..368c975 100644 >> --- a/xen/arch/x86/mm/paging.c >> +++ b/xen/arch/x86/mm/paging.c >> @@ -168,7 +168,7 @@ int paging_log_dirty_enable(struct domain *d, > bool_t log_global) >> { >> int ret; >> - if ( paging_mode_log_dirty(d) ) >> + if ( paging_mode_log_dirty(d) && !log_global ) >> return -EINVAL; >> domain_pause(d); >> >>> Jan >> >> Best regards, >> Yang >> >> Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Xu, Dongxiao" <dongxiao.xu@intel.com> To: "Zhang, Xiantao" <xiantao.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>, "Zhang, Yang Z" <yang.z.zhang@intel.com> Cc: "Dugger, Donald D" <donald.d.dugger@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 06:40:54 +0000 Message-ID: <40776A41FC278F40B59438AD47D147A91194BB6B@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Tuesday, February 18, 2014 11:29 PM > To: Xu, Dongxiao; Zhang, Xiantao; Zhang, Yang Z > Cc: andrew.cooper3@citrix.com; George Dunlap; Dugger, Donald D; > xen-devel@lists.xen.org; Tim Deegan > Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty > to track vram > > >>> On 18.02.14 at 12:46, "Jan Beulich" <JBeulich@suse.com> wrote: > > Nothing at all, as it turns out. The regression is due to Dongxiao's > > > > http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.html > > > > which I have in my tree as part of various things pending for 4.5. > > And which at the first, second, and third glance looks pretty > > innocent (IOW I still have to find out _why_ it is wrong). > > And here's a fixed version of the patch - we simply can't drop > the bogus HVM_PARAM_IDENT_PT check entirely yet. > > In the course of fixing this I also found two other shortcomings: > - EPT EMT field should be updated upon guest MTRR writes (the > lack thereof is the reason for needing to retain the bogus check) > - epte_get_entry_emt() either needs "order" passed, or its callers > must call it more than once for big/huge pages > The below patch looks okay to me. Thanks! Thanks, Dongxiao > Jan > > x86/hvm: refine the judgment on IDENT_PT for EMT > > When trying to get the EPT EMT type, the judgment on > HVM_PARAM_IDENT_PT is not correct which always returns WB type if > the parameter is not set. Remove the related code. > > Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> > > We can't fully drop the dependency yet, but we should certainly avoid > overriding cases already properly handled. The reason for this is that > the guest setting up its MTRRs happens _after_ the EPT tables got > already constructed, and no code is in place to propagate this to the > EPT code. Without this check we're forcing the guest to run with all of > its memory uncachable until something happens to re-write every single > EPT entry. But of course this has to be just a temporary solution. > > In the same spirit we should defer the "very early" (when the guest is > still being constructed and has no vCPU yet) override to the last > possible point. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/x86/hvm/mtrr.c > +++ b/xen/arch/x86/hvm/mtrr.c > @@ -689,13 +689,8 @@ uint8_t epte_get_entry_emt(struct domain > > *ipat = 0; > > - if ( (current->domain != d) && > - ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) ) > - return MTRR_TYPE_WRBACK; > - > - if ( !is_pvh_vcpu(v) && > - !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > - return MTRR_TYPE_WRBACK; > + if ( v->domain != d ) > + v = d->vcpu ? d->vcpu[0] : NULL; > > if ( !mfn_valid(mfn_x(mfn)) ) > return MTRR_TYPE_UNCACHABLE; > @@ -718,7 +713,8 @@ uint8_t epte_get_entry_emt(struct domain > return MTRR_TYPE_WRBACK; > } > > - gmtrr_mtype = is_hvm_vcpu(v) ? > + gmtrr_mtype = is_hvm_domain(d) && v && > + d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ? > get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn << > PAGE_SHIFT)) : > MTRR_TYPE_WRBACK; > > > > _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: Dongxiao Xu <dongxiao.xu@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, DonaldD Dugger <donald.d.dugger@intel.com>, Xiantao Zhang <xiantao.zhang@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 08:50:55 +0000 Message-ID: <53047E7F020000780011D8E1@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 19.02.14 at 02:17, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: > Jan Beulich wrote on 2014-02-18: >>>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: >>> Jan Beulich wrote on 2014-02-17: >>>>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote: >>>>> And second, I have been fighting with finding both conditions and >>>>> (eventually) the root cause of a severe performance regression >>>>> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This >>>>> became _much_ worse after adding in the patch here (while in fact >>>>> I had hoped it might help with the originally observed >>>>> degradation): X startup fails due to timing out, and booting the >>>>> guest now takes about 20 minutes). I didn't find the root cause of >>>>> this yet, but meanwhile I know that >>>>> - the same isn't observable on SVM >>>>> - there's no problem when forcing the domain to use shadow >>>>> mode - there's no need for any device to actually be assigned to the >>>>> guest - the regression is very likely purely graphics related (based >>>>> on the observation that when running something that regularly but not >>>>> heavily updates the screen with X up, the guest consumes a full CPU's >>>>> worth of processing power, yet when that updating doesn't >>>>> happen, >> CPU >>>>> consumption goes down, and it goes further down when shutting >>>>> down >> X >>>>> altogether - at least as log as the patch here doesn't get involved). >>>>> This I'm observing on a Westmere box (and I didn't notice it >>>>> earlier because that's one of those where due to a chipset erratum >>>>> the IOMMU gets turned off by default), so it's possible that this >>>>> can't be seen on more modern hardware. I'll hopefully find time >>>>> today to check this on the one newer (Sandy Bridge) box I have. >>>> >>>> Just got done with trying this: By default, things work fine there. As >>>> soon as I use "iommu=no-snoop", things go bad (even worse than one the >>>> older box - the guest is consuming about 2.5 CPUs worth of processing >>>> power _without_ the patch here in use, so I don't even want to think >>>> about trying it there); I guessed that to be another of the potential >>>> sources of the problem since on that older box the respective hardware >>>> feature is unavailable. >>>> >>>> While I'll try to look into this further, I guess I have to defer >>>> to our VT-d specialists at Intel at this point... >>>> >>> >>> Hi, Jan, >>> >>> I tried to reproduce it. But unfortunately, I cannot reproduce it in >>> my box (sandy bridge EP)with latest Xen(include my patch). I guess >>> my configuration or steps may wrong, here is mine: >>> >>> 1. add iommu=1,no-snoop in by xen cmd line: >>> (XEN) Intel VT-d Snoop Control not enabled. >>> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled. >>> (XEN) Intel VT-d Queued Invalidation enabled. >>> (XEN) Intel VT-d Interrupt Remapping enabled. >>> (XEN) Intel VT-d Shared EPT tables enabled. >>> >>> 2. boot a rhel6u4 guest. >>> >>> 3. after guest boot up, run startx inside guest. >>> >>> 4. a few second, the X windows shows and didn't see any error. Also >>> the CPU utilization is about 1.7%. >>> >>> Any thing wrong? >> >> Nothing at all, as it turns out. The regression is due to Dongxiao's >> >> http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.h >> tml >> >> which I have in my tree as part of various things pending for 4.5. >> And which at the first, second, and third glance looks pretty innocent >> (IOW I still have to find out _why_ it is wrong). >> >> In any case - I'm very sorry for the false alarm. >> > > It doesn't matter. Conversely, we need to thank you for helping us to fix > this issue. :) > > BTW, I still cannot reproduce it in my box, even I uses SLES 11 SP3 as > guest. I assume you didn't pull in the broken patch - I'm sure you would see the problem if you did. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Donald D Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <george.dunlap@eu.citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 08:55:00 +0000 Message-ID: <53047F74020000780011D8E4@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: > George Dunlap wrote on 2014-02-18: >> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote: >> perhaps my original patch is better which will check >> paging_mode_log_dirty(d) && log_global: >> >> It turns out that the reason I couldn't get a crash was because libxc >> was actually paying attention to the -EINVAL return value, and >> disabling and then re-enabling logdirty. That's what would happen >> before your dirty vram patch, and that's what happens after. And >> arguably, that's the correct behavior for any toolstack, given that the > interface returns an error. > > Agree. > >> >> This patch would actually change the interface; if we check this in, >> then if you enable logdirty when dirty vram tracking is enabled, you >> *won't* get an error, and thus *won't* disable and re-enable logdirty mode. >> So actually, this patch would be more disruptive. >> > > Jan, do you have any comment? This simplistic variant is just calling for problems. As was already said elsewhere on this thread, we should simply do the mode change properly: Track that a partial log-dirty mode is in use, and allow switching to global log-dirty mode (converting all entries to R/O). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Yang Z Zhang <yang.z.zhang@intel.com>, Jan Beulich <JBeulich@suse.com> Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Donald D Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 11:03:50 +0000 Message-ID: <53048F96.7010503@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 02/19/2014 08:55 AM, Jan Beulich wrote: >>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: >> George Dunlap wrote on 2014-02-18: >>> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote: >>> perhaps my original patch is better which will check >>> paging_mode_log_dirty(d) && log_global: >>> >>> It turns out that the reason I couldn't get a crash was because libxc >>> was actually paying attention to the -EINVAL return value, and >>> disabling and then re-enabling logdirty. That's what would happen >>> before your dirty vram patch, and that's what happens after. And >>> arguably, that's the correct behavior for any toolstack, given that the >> interface returns an error. >> >> Agree. >> >>> This patch would actually change the interface; if we check this in, >>> then if you enable logdirty when dirty vram tracking is enabled, you >>> *won't* get an error, and thus *won't* disable and re-enable logdirty mode. >>> So actually, this patch would be more disruptive. >>> >> Jan, do you have any comment? > This simplistic variant is just calling for problems. As was already > said elsewhere on this thread, we should simply do the mode change > properly: Track that a partial log-dirty mode is in use, and allow > switching to global log-dirty mode (converting all entries to R/O). I think Yang was asking you for your opinion on my suggestion that nothing actually needed to be done. Enabling full logdirty mode for migration when dirty vram tracking was enabled has *always* returned an error (or at least for a long time now), and *always* resulted in the toolstack disabling and re-enabling logdirty mode; Yang's patch doesn't change that at all. If you think that's an interface we need to improve in the future, we can put it on the list of improvements. But at this point it seems to me more like a nice-to-have. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com> Cc: Donald D Dugger <donald.d.dugger@intel.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 11:13:07 +0000 Message-ID: <53049FD3020000780011DA37@nat28.tlf.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 19.02.14 at 12:03, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 02/19/2014 08:55 AM, Jan Beulich wrote: >>>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: >>> George Dunlap wrote on 2014-02-18: >>>> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote: >>>> perhaps my original patch is better which will check >>>> paging_mode_log_dirty(d) && log_global: >>>> >>>> It turns out that the reason I couldn't get a crash was because libxc >>>> was actually paying attention to the -EINVAL return value, and >>>> disabling and then re-enabling logdirty. That's what would happen >>>> before your dirty vram patch, and that's what happens after. And >>>> arguably, that's the correct behavior for any toolstack, given that the >>> interface returns an error. >>> >>> Agree. >>> >>>> This patch would actually change the interface; if we check this in, >>>> then if you enable logdirty when dirty vram tracking is enabled, you >>>> *won't* get an error, and thus *won't* disable and re-enable logdirty mode. >>>> So actually, this patch would be more disruptive. >>>> >>> Jan, do you have any comment? >> This simplistic variant is just calling for problems. As was already >> said elsewhere on this thread, we should simply do the mode change >> properly: Track that a partial log-dirty mode is in use, and allow >> switching to global log-dirty mode (converting all entries to R/O). > > I think Yang was asking you for your opinion on my suggestion that > nothing actually needed to be done. Enabling full logdirty mode for > migration when dirty vram tracking was enabled has *always* returned an > error (or at least for a long time now), and *always* resulted in the > toolstack disabling and re-enabling logdirty mode; Yang's patch doesn't > change that at all. > > If you think that's an interface we need to improve in the future, we > can put it on the list of improvements. But at this point it seems to > me more like a nice-to-have. I agree - for 4.4.0 we shouldn't need any further adjustments. And I hoped to imply that I don't see a need for this incremental change to go in by having said "This simplistic variant is just calling for problems". Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com>, Yang Z Zhang <yang.z.zhang@intel.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, Donald D Dugger <donald.d.dugger@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 19 Feb 2014 11:17:43 +0000 Message-ID: <530492D7.6050503@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 02/19/2014 11:13 AM, Jan Beulich wrote: >>>> On 19.02.14 at 12:03, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> On 02/19/2014 08:55 AM, Jan Beulich wrote: >>>>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: >>>> George Dunlap wrote on 2014-02-18: >>>>> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote: >>>>> perhaps my original patch is better which will check >>>>> paging_mode_log_dirty(d) && log_global: >>>>> >>>>> It turns out that the reason I couldn't get a crash was because libxc >>>>> was actually paying attention to the -EINVAL return value, and >>>>> disabling and then re-enabling logdirty. That's what would happen >>>>> before your dirty vram patch, and that's what happens after. And >>>>> arguably, that's the correct behavior for any toolstack, given that the >>>> interface returns an error. >>>> >>>> Agree. >>>> >>>>> This patch would actually change the interface; if we check this in, >>>>> then if you enable logdirty when dirty vram tracking is enabled, you >>>>> *won't* get an error, and thus *won't* disable and re-enable logdirty mode. >>>>> So actually, this patch would be more disruptive. >>>>> >>>> Jan, do you have any comment? >>> This simplistic variant is just calling for problems. As was already >>> said elsewhere on this thread, we should simply do the mode change >>> properly: Track that a partial log-dirty mode is in use, and allow >>> switching to global log-dirty mode (converting all entries to R/O). >> I think Yang was asking you for your opinion on my suggestion that >> nothing actually needed to be done. Enabling full logdirty mode for >> migration when dirty vram tracking was enabled has *always* returned an >> error (or at least for a long time now), and *always* resulted in the >> toolstack disabling and re-enabling logdirty mode; Yang's patch doesn't >> change that at all. >> >> If you think that's an interface we need to improve in the future, we >> can put it on the list of improvements. But at this point it seems to >> me more like a nice-to-have. > I agree - for 4.4.0 we shouldn't need any further adjustments. And > I hoped to imply that I don't see a need for this incremental change > to go in by having said "This simplistic variant is just calling for > problems". No, but "we should simply do the mode change properly" could be interpreted as saying, "this needs to be done as a follow-up to the dirty vram tracking patch"; someone might even interpret it as, "you need to do this as a follow-up". That's what I was trying to clarify / express an opinion on. :-) -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Tim Deegan <tim@xen.org>, Jan Beulich <JBeulich@suse.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, George Dunlap <george.dunlap@eu.citrix.com>, "'Keir Fraser \(keir.xen@gmail.com\)'" <keir.xen@gmail.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 19 May 2014 07:48:01 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACCE42@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Tim Deegan wrote on 2014-02-14: > At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote: >>>>> On 13.02.14 at 16:46, George Dunlap >>>>> <george.dunlap@eu.citrix.com> > wrote: >>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: >>>> George Dunlap wrote on 2014-02-11: >>>>> I think I got a bit distracted with the "A isn't really so bad" thing. >>>>> Actually, if the overhead of not sharing tables isn't very high, >>>>> then B isn't such a bad option. In fact, B is what I expected >>>>> Yang to submit when he originally described the problem. >>>> Actually, the first solution came to my mind is B. Then I >>>> realized that even >>> chose B, we still cannot track the memory updating from DMA(even with >>> A/D bit, it still a problem). Also, considering the current usage case >>> of log dirty in Xen(only vram tracking has problem), I though A is >>> better.: Hypervisor only need to track the vram change. If a malicious >>> guest try to DMA to vram range, it only crashed himself (This should be > reasonable). >>>> >>>>> I was going to say, from a release perspective, B is probably >>>>> the safest option for now. But on the other hand, if we've been >>>>> testing sharing all this time, maybe switching back over to >>>>> non-sharing whole-hog has >>> the higher risk? >>>> Another problem with B is that current VT-d large paging >>>> supporting relies on >>> the sharing EPT and VT-d page table. This means if we choose B, >>> then we need to re-enable VT-d large page. This would be a huge >>> performance impaction for Xen 4.4 on using VT-d solution. >>> >>> OK -- if that's the case, then it definitely tips the balance back >>> to A. Unless Tim or Jan disagrees, can one of you two check it in? >>> >>> Don't rush your judgement; but it would be nice to have this in >>> before RC4, which would mean checking it in today preferrably, or >>> early tomorrow at the latest. >> >> That would be Tim then, as he would have to approve of it anyway. > > Done. > >> I should also say that while I certainly understand the >> argumentation above, I would still want to go this route only with >> the promise that B is going to be worked on reasonably soon after >> the release, ideally with the goal of backporting the changes for 4.4.1. > > Agreed. > > Tim. Hi all Sorry to turn out this old thread. Because I just noticed that someone is asking when Intel will implement the VT-d page table separately. Actually, I am totally unaware it. The original issue that this patch tries to fix is the VRAM tracking which using the global log dirty mode. And I thought the best solution to fix it is in VRAM side not VT-d side. Because even use separate VT-d page table, we still cannot track the memory update from DMA. Even worse, I think two page tables introduce redundant code and maintain effort. So I wonder is it really necessary to implement the separate VT-d large page? Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>, Keir Fraser <keir@xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 19 May 2014 10:03:35 +0100 Message-ID: <5379E50702000078000136F6@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 19.05.14 at 09:48, <yang.z.zhang@intel.com> wrote: > Because I just noticed that someone is asking when Intel will implement the > VT-d page table separately. Actually, I am totally unaware it. This was a request sent directly to you, so it shouldn't be a surprise. > The original > issue that this patch tries to fix is the VRAM tracking which using the > global log dirty mode. And I thought the best solution to fix it is in VRAM > side not VT-d side. Because even use separate VT-d page table, we still cannot > track the memory update from DMA. Correct. But at least we can avoid IOMMU faults by not marking read- only the VRAM region. Unless the guest stores data in the VRAM region that's not directly displayed information, but needs to be preserved across migration, the only downside of this would be temporary screen corruption in the guest immediately following migration. Clearly far better than eventually turning off passed through devices due to excessive IOMMU faults. > Even worse, I think two page tables > introduce redundant code and maintain effort. So I wonder is it really > necessary to implement the separate VT-d large page? Tim and I certainly think so. Andrew had a valid point in stating that guests without the need for VRAM dirty tracking, but with assigned PCI device(s) could still benefit from sharing page tables, so working towards a model where this could be retained would clearly be the optimal solution. I wonder whether you actually too the time to go through the old thread before writing your mail, since all you did is repeat your old arguments, without addressing any of the reasons given why we consider separate page tables better than shared ones. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <George.Dunlap@eu.citrix.com> To: "Zhang, Yang Z" <yang.z.zhang@intel.com> Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jan Beulich <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "Nakajima, Jun" <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 19 May 2014 14:27:59 +0100 Message-ID: <CAFLBxZbn3X3EPjMyHqhNi4m18gYcTWnXK+Vwb3nC7PZUiXD2uw@mail.gmail.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote: > Tim Deegan wrote on 2014-02-14: >> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote: >>>>>> On 13.02.14 at 16:46, George Dunlap >>>>>> <george.dunlap@eu.citrix.com> >> wrote: >>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote: >>>>> George Dunlap wrote on 2014-02-11: >>>>>> I think I got a bit distracted with the "A isn't really so bad" thing. >>>>>> Actually, if the overhead of not sharing tables isn't very high, >>>>>> then B isn't such a bad option. In fact, B is what I expected >>>>>> Yang to submit when he originally described the problem. >>>>> Actually, the first solution came to my mind is B. Then I >>>>> realized that even >>>> chose B, we still cannot track the memory updating from DMA(even with >>>> A/D bit, it still a problem). Also, considering the current usage case >>>> of log dirty in Xen(only vram tracking has problem), I though A is >>>> better.: Hypervisor only need to track the vram change. If a malicious >>>> guest try to DMA to vram range, it only crashed himself (This should be >> reasonable). >>>>> >>>>>> I was going to say, from a release perspective, B is probably >>>>>> the safest option for now. But on the other hand, if we've been >>>>>> testing sharing all this time, maybe switching back over to >>>>>> non-sharing whole-hog has >>>> the higher risk? >>>>> Another problem with B is that current VT-d large paging >>>>> supporting relies on >>>> the sharing EPT and VT-d page table. This means if we choose B, >>>> then we need to re-enable VT-d large page. This would be a huge >>>> performance impaction for Xen 4.4 on using VT-d solution. >>>> >>>> OK -- if that's the case, then it definitely tips the balance back >>>> to A. Unless Tim or Jan disagrees, can one of you two check it in? >>>> >>>> Don't rush your judgement; but it would be nice to have this in >>>> before RC4, which would mean checking it in today preferrably, or >>>> early tomorrow at the latest. >>> >>> That would be Tim then, as he would have to approve of it anyway. >> >> Done. >> >>> I should also say that while I certainly understand the >>> argumentation above, I would still want to go this route only with >>> the promise that B is going to be worked on reasonably soon after >>> the release, ideally with the goal of backporting the changes for 4.4.1. >> >> Agreed. >> >> Tim. > > Hi all > > Sorry to turn out this old thread. > Because I just noticed that someone is asking when Intel will implement the VT-d page table separately. Actually, I am totally unaware it. The original issue that this patch tries to fix is the VRAM tracking which using the global log dirty mode. And I thought the best solution to fix it is in VRAM side not VT-d side. Because even use separate VT-d page table, we still cannot track the memory update from DMA. Even worse, I think two page tables introduce redundant code and maintain effort. So I wonder is it really necessary to implement the separate VT-d large page? Yes, it does introduce redundant code. But unfortunately, IOMMU faults at the moment have to be considered rather risky; having on happens risks (in order of decreasing probability / increasing damage): * Device stops working for that VM until an FLR (losing a lot of its state) * The VM has to be killed * The device stops working until a host reboot * The host crashes Avoiding these by "hoping" that the guest OS doesn't DMA into a video buffer isn't really robust enough. I think that was Tim and Jan's primary reason for wanting the ability to have separate tables for HAP and IOMMU. Is that about right, Jan / Tim? -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <George.Dunlap@eu.citrix.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 19 May 2014 14:50:39 +0100 Message-ID: <537A284F0200007800013ADC@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 19.05.14 at 15:27, <George.Dunlap@eu.citrix.com> wrote: > On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote: >> Because I just noticed that someone is asking when Intel will implement the > VT-d page table separately. Actually, I am totally unaware it. The original > issue that this patch tries to fix is the VRAM tracking which using the > global log dirty mode. And I thought the best solution to fix it is in VRAM > side not VT-d side. Because even use separate VT-d page table, we still > cannot track the memory update from DMA. Even worse, I think two page tables > introduce redundant code and maintain effort. So I wonder is it really > necessary to implement the separate VT-d large page? > > Yes, it does introduce redundant code. But unfortunately, IOMMU > faults at the moment have to be considered rather risky; having on > happens risks (in order of decreasing probability / increasing > damage): > * Device stops working for that VM until an FLR (losing a lot of its state) > * The VM has to be killed > * The device stops working until a host reboot > * The host crashes > > Avoiding these by "hoping" that the guest OS doesn't DMA into a video > buffer isn't really robust enough. I think that was Tim and Jan's > primary reason for wanting the ability to have separate tables for HAP > and IOMMU. > > Is that about right, Jan / Tim? Yes, and not just "about" (perhaps with the exception that I think/ hope we don't have any lurking host crashes here). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com>, Yang Z Zhang <yang.z.zhang@intel.com> Cc: Xiantao Zhang <xiantao.zhang@intel.com>, "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 19 May 2014 14:59:49 +0100 Message-ID: <537A0E55.502@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 05/19/2014 02:50 PM, Jan Beulich wrote: >>>> On 19.05.14 at 15:27, <George.Dunlap@eu.citrix.com> wrote: >> On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote: >>> Because I just noticed that someone is asking when Intel will implement the >> VT-d page table separately. Actually, I am totally unaware it. The original >> issue that this patch tries to fix is the VRAM tracking which using the >> global log dirty mode. And I thought the best solution to fix it is in VRAM >> side not VT-d side. Because even use separate VT-d page table, we still >> cannot track the memory update from DMA. Even worse, I think two page tables >> introduce redundant code and maintain effort. So I wonder is it really >> necessary to implement the separate VT-d large page? >> >> Yes, it does introduce redundant code. But unfortunately, IOMMU >> faults at the moment have to be considered rather risky; having on >> happens risks (in order of decreasing probability / increasing >> damage): >> * Device stops working for that VM until an FLR (losing a lot of its state) >> * The VM has to be killed >> * The device stops working until a host reboot >> * The host crashes >> >> Avoiding these by "hoping" that the guest OS doesn't DMA into a video >> buffer isn't really robust enough. I think that was Tim and Jan's >> primary reason for wanting the ability to have separate tables for HAP >> and IOMMU. >> >> Is that about right, Jan / Tim? > Yes, and not just "about" (perhaps with the exception that I think/ > hope we don't have any lurking host crashes here). I think the fear was that buggy hardware might cause a host crash / hang. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "George Dunlap" <george.dunlap@eu.citrix.com>, "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: Xiantao Zhang <xiantao.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 19 May 2014 15:19:16 +0100 Message-ID: <537A2F040200007800013B6B@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 19.05.14 at 15:59, <george.dunlap@eu.citrix.com> wrote: > On 05/19/2014 02:50 PM, Jan Beulich wrote: >>>>> On 19.05.14 at 15:27, <George.Dunlap@eu.citrix.com> wrote: >>> On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote: >>>> Because I just noticed that someone is asking when Intel will implement the >>> VT-d page table separately. Actually, I am totally unaware it. The original >>> issue that this patch tries to fix is the VRAM tracking which using the >>> global log dirty mode. And I thought the best solution to fix it is in VRAM >>> side not VT-d side. Because even use separate VT-d page table, we still >>> cannot track the memory update from DMA. Even worse, I think two page tables >>> introduce redundant code and maintain effort. So I wonder is it really >>> necessary to implement the separate VT-d large page? >>> >>> Yes, it does introduce redundant code. But unfortunately, IOMMU >>> faults at the moment have to be considered rather risky; having on >>> happens risks (in order of decreasing probability / increasing >>> damage): >>> * Device stops working for that VM until an FLR (losing a lot of its state) >>> * The VM has to be killed >>> * The device stops working until a host reboot >>> * The host crashes >>> >>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video >>> buffer isn't really robust enough. I think that was Tim and Jan's >>> primary reason for wanting the ability to have separate tables for HAP >>> and IOMMU. >>> >>> Is that about right, Jan / Tim? >> Yes, and not just "about" (perhaps with the exception that I think/ >> hope we don't have any lurking host crashes here). > > I think the fear was that buggy hardware might cause a host crash / hang. That's a valid concern indeed. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Jan Beulich <JBeulich@suse.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>, Keir Fraser <keir@xen.org>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 20 May 2014 03:09:10 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACDEA3@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Jan Beulich wrote on 2014-05-19: >>>> On 19.05.14 at 09:48, <yang.z.zhang@intel.com> wrote: >> Because I just noticed that someone is asking when Intel will >> implement the VT-d page table separately. Actually, I am totally unaware it. > > This was a request sent directly to you, so it shouldn't be a surprise. Yes, but I am not buying in it since I think it not the right direction to fix this issue: This is my original point: Actually, the first solution came to my mind is B. Then I realized that even chose B, we still cannot track the memory updating from DMA(even with A/D bit, it still a problem). Also, considering the current usage case of log dirty in Xen(only vram tracking has problem), I though A is better.: Hypervisor only need to track the vram change. If a malicious guest try to DMA to vram range, it only crashed himself (This should be reasonable). > >> The original >> issue that this patch tries to fix is the VRAM tracking which using >> the global log dirty mode. And I thought the best solution to fix it >> is in VRAM side not VT-d side. Because even use separate VT-d page >> table, we still cannot track the memory update from DMA. > > Correct. But at least we can avoid IOMMU faults by not marking read- > only the VRAM region. Unless the guest stores data in the VRAM region It is easy to trigger an IOMMU faults by guest, like set an invalid DMA address. We cannot prevent it. > that's not directly displayed information, but needs to be preserved > across migration, the only downside of this would be temporary screen > corruption in the guest immediately following migration. Clearly far > better than eventually turning off passed through devices due to excessive IOMMU faults. > >> Even worse, I think two page tables >> introduce redundant code and maintain effort. So I wonder is it >> really necessary to implement the separate VT-d large page? > > Tim and I certainly think so. Andrew had a valid point in stating that > guests without the need for VRAM dirty tracking, but with assigned PCI > device(s) could still benefit from sharing page tables, so working > towards a model where this could be retained would clearly be the optimal solution. I still cannot understand what we want to achieve or fix by separate the page table. > > I wonder whether you actually too the time to go through the old > thread before writing your mail, since all you did is repeat your old > arguments, without addressing any of the reasons given why we consider > separate page tables better than shared ones. I have gone through it many times. But I cannot find a strong reason to persuade me to do it. I think it is a 'nice to have' feature, not required. Unless if anyone can show me that current sharing mechanism do have problem and it only can be fixed by separate page table, then I think we must to do it. > > Jan Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: George Dunlap <George.Dunlap@eu.citrix.com> Cc: "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 20 May 2014 03:13:20 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACDED4@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
George Dunlap wrote on 2014-05-19: >> >> Hi all >> >> Sorry to turn out this old thread. >> Because I just noticed that someone is asking when Intel will >> implement the > VT-d page table separately. Actually, I am totally unaware it. The > original issue that this patch tries to fix is the VRAM tracking which > using the global log dirty mode. And I thought the best solution to fix it is in VRAM side not VT-d side. > Because even use separate VT-d page table, we still cannot track the > memory update from DMA. Even worse, I think two page tables introduce > redundant code and maintain effort. So I wonder is it really necessary > to implement the separate VT-d large page? > > Yes, it does introduce redundant code. But unfortunately, IOMMU faults > at the moment have to be considered rather risky; having on happens You cannot prevent guest to trigger the IOMMU faults. It is easy to trigger an IOMMU fault by setting a invalid DMA address. > risks (in order of decreasing probability / increasing damage): * Device > stops working for that VM until an FLR (losing a lot of its state) * The > VM has to be killed * The device stops working until a host reboot * The > host crashes > > Avoiding these by "hoping" that the guest OS doesn't DMA into a video > buffer isn't really robust enough. I think that was Tim and Jan's Video buffer is only one case. How we can prevent the DMA to other reserved region? > primary reason for wanting the ability to have separate tables for HAP and IOMMU. > > Is that about right, Jan / Tim? > > -George Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Keir Fraser <keir@xen.org>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 20 May 2014 08:17:35 +0100 Message-ID: <537B1DAF0200007800013EAB@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 20.05.14 at 05:09, <yang.z.zhang@intel.com> wrote: > Jan Beulich wrote on 2014-05-19: >>>>> On 19.05.14 at 09:48, <yang.z.zhang@intel.com> wrote: >>> Because I just noticed that someone is asking when Intel will >>> implement the VT-d page table separately. Actually, I am totally unaware it. >> >> This was a request sent directly to you, so it shouldn't be a surprise. > > Yes, but I am not buying in it since I think it not the right direction to > fix this issue: > > This is my original point: > Actually, the first solution came to my mind is B. Then I realized that even > chose B, we still cannot track the memory updating from DMA(even with A/D > bit, it still a problem). Also, considering the current usage case of log > dirty in Xen(only vram tracking has problem), I though A is better.: > Hypervisor only need to track the vram change. If a malicious guest try to > DMA to vram range, it only crashed himself (This should be reasonable). Except that the guest in no way needs to be malicious. I think you forget that the ultimate goal of virtualization ought to be to make guests behave the same (in terms of correctness, not generally in terms of performance) as if run on real hardware, not matter what they do. And DMA to VRAM wouldn't crash on real hardware (and I can see legitimate uses of such). >>> The original >>> issue that this patch tries to fix is the VRAM tracking which using >>> the global log dirty mode. And I thought the best solution to fix it >>> is in VRAM side not VT-d side. Because even use separate VT-d page >>> table, we still cannot track the memory update from DMA. >> >> Correct. But at least we can avoid IOMMU faults by not marking read- >> only the VRAM region. Unless the guest stores data in the VRAM region > > It is easy to trigger an IOMMU faults by guest, like set an invalid DMA > address. We cannot prevent it. Correct (and this btw also contradicts above spelled out goal: On real hardware this would in the majority of cases also just cause certain operations to become ineffectual rather than bring down the system - we just have to be more rigid here to deal with potential malicious guests). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 20 May 2014 08:20:18 +0100 Message-ID: <537B1E520200007800013EB7@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote: > George Dunlap wrote on 2014-05-19: >> Avoiding these by "hoping" that the guest OS doesn't DMA into a video >> buffer isn't really robust enough. I think that was Tim and Jan's > > Video buffer is only one case. How we can prevent the DMA to other reserved > region? You continue to neglect the difference: Accessing VRAM this way is legitimate (and potentially useful). And - as just said in the other reply - ideally we'd also simply ignore accesses to reserved regions (and in fact we try to, by not immediately bringing down a guest device doing such). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <George.Dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com> Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 20 May 2014 11:12:41 +0100 Message-ID: <CAFLBxZYgx10bwKz1fQ_2vTs2mE5DvxgBUwmbEE5k1toH84robQ@mail.gmail.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote: >> George Dunlap wrote on 2014-05-19: >>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video >>> buffer isn't really robust enough. I think that was Tim and Jan's >> >> Video buffer is only one case. How we can prevent the DMA to other reserved >> region? > > You continue to neglect the difference: Accessing VRAM this way is > legitimate (and potentially useful). And - as just said in the other > reply - ideally we'd also simply ignore accesses to reserved regions > (and in fact we try to, by not immediately bringing down a guest > device doing such). On the other hand, just to play devil's advocate here: Implementing separate IOMMU tables (including superpages) isn't free; it has a non-negligible cost, both in initial developer time, continuing maintenance (code complexity, fixing bugs), extra memory at run-time, &c. Of all the things we could invest that developer time doing, why should we make it possible to DMA into VRAM, rather than doing something else? -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "George Dunlap" <George.Dunlap@eu.citrix.com> Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Tue, 20 May 2014 11:46:31 +0100 Message-ID: <537B4EA70200007800014085@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote: > On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote: >>> George Dunlap wrote on 2014-05-19: >>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video >>>> buffer isn't really robust enough. I think that was Tim and Jan's >>> >>> Video buffer is only one case. How we can prevent the DMA to other reserved >>> region? >> >> You continue to neglect the difference: Accessing VRAM this way is >> legitimate (and potentially useful). And - as just said in the other >> reply - ideally we'd also simply ignore accesses to reserved regions >> (and in fact we try to, by not immediately bringing down a guest >> device doing such). > > On the other hand, just to play devil's advocate here: Implementing > separate IOMMU tables (including superpages) isn't free; it has a > non-negligible cost, both in initial developer time, continuing > maintenance (code complexity, fixing bugs), extra memory at run-time, > &c. > > Of all the things we could invest that developer time doing, why > should we make it possible to DMA into VRAM, rather than doing > something else? While I agree that the question is valid, my position really is that it was a mistake to implement the IOMMU code without superpage support, i.e. I view this as a shortcoming independent of the VRAM issue, and I would want to see this fixed rather sooner than later. Had it been done properly from the beginning (like one would expect for non-experimental code), a lot of this discussion could have been avoided, and we wouldn't have had to take the respective workaround close to the 4.4 release. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: George Dunlap <George.Dunlap@eu.citrix.com>, Jan Beulich <JBeulich@suse.com> Cc: "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 21 May 2014 01:02:41 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACF345@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Jan Beulich wrote on 2014-05-20: >>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote: >> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote: >>>> George Dunlap wrote on 2014-05-19: >>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a >>>>> video buffer isn't really robust enough. I think that was Tim >>>>> and Jan's >>>> >>>> Video buffer is only one case. How we can prevent the DMA to other >>>> reserved region? >>> >>> You continue to neglect the difference: Accessing VRAM this way is >>> legitimate (and potentially useful). And - as just said in the >>> other reply - ideally we'd also simply ignore accesses to reserved Can you give an example of what legitmate case you are mentioned twice(You mentioned it also in other reply)? I cannot understand why we need to restrict the CPU access to VRAM region but allow accessing from device. As I known, for gfx passthrough, both device and CPU are able to access them. And for emulated gfx, only software will access it which same as current we see in Xen. >>> regions (and in fact we try to, by not immediately bringing down a >>> guest device doing such). >> >> On the other hand, just to play devil's advocate here: Implementing >> separate IOMMU tables (including superpages) isn't free; it has a >> non-negligible cost, both in initial developer time, continuing >> maintenance (code complexity, fixing bugs), extra memory at >> run-time, &c. >> >> Of all the things we could invest that developer time doing, why >> should we make it possible to DMA into VRAM, rather than doing >> something else? > > While I agree that the question is valid, my position really is that > it was a mistake to implement the IOMMU code without superpage We support the superpage via sharing EPT and VT-d pagetable. > support, i.e. I view this as a shortcoming independent of the VRAM > issue, and I would want to see this fixed rather sooner than later. > Had it been done properly from the beginning (like one would expect > for non-experimental code), a lot of this discussion could have been > avoided, and we wouldn't have had to take the respective workaround > close to the 4.4 release. I still think the best solution is fixing the VRAM global log dirty mechanism which my previous patch already did. Because I cannot see any benefit with separating the page table. > > Jan Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 21 May 2014 08:49:15 +0100 Message-ID: <537C769B02000078000145C2@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 21.05.14 at 03:02, <yang.z.zhang@intel.com> wrote: > Jan Beulich wrote on 2014-05-20: >>>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote: >>> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote: >>>>> George Dunlap wrote on 2014-05-19: >>>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a >>>>>> video buffer isn't really robust enough. I think that was Tim >>>>>> and Jan's >>>>> >>>>> Video buffer is only one case. How we can prevent the DMA to other >>>>> reserved region? >>>> >>>> You continue to neglect the difference: Accessing VRAM this way is >>>> legitimate (and potentially useful). And - as just said in the >>>> other reply - ideally we'd also simply ignore accesses to reserved > > Can you give an example of what legitmate case you are mentioned twice(You > mentioned it also in other reply)? What's wrong with loading some graphics data right from an I/O device (disk, network) into the frame buffer? > I cannot understand why we need to > restrict the CPU access to VRAM region but allow accessing from device. We'd love to also do this for device accesses, but to date IOMMU fault are unrecoverable, and hence this is not an option. > As I > known, for gfx passthrough, both device and CPU are able to access them. And > for emulated gfx, only software will access it which same as current we see > in Xen. Why's that? I.e. why should a guest with emulated graphics not be able to program a passed through device to access the VRAM directly? >>>> regions (and in fact we try to, by not immediately bringing down a >>>> guest device doing such). >>> >>> On the other hand, just to play devil's advocate here: Implementing >>> separate IOMMU tables (including superpages) isn't free; it has a >>> non-negligible cost, both in initial developer time, continuing >>> maintenance (code complexity, fixing bugs), extra memory at >>> run-time, &c. >>> >>> Of all the things we could invest that developer time doing, why >>> should we make it possible to DMA into VRAM, rather than doing >>> something else? >> >> While I agree that the question is valid, my position really is that >> it was a mistake to implement the IOMMU code without superpage > > We support the superpage via sharing EPT and VT-d pagetable. I.e. without EPT no superpages. Afaict there's no such limitation on the AMD side. >> support, i.e. I view this as a shortcoming independent of the VRAM >> issue, and I would want to see this fixed rather sooner than later. >> Had it been done properly from the beginning (like one would expect >> for non-experimental code), a lot of this discussion could have been >> avoided, and we wouldn't have had to take the respective workaround >> close to the 4.4 release. > > I still think the best solution is fixing the VRAM global log dirty > mechanism which my previous patch already did. Because I cannot see any > benefit with separating the page table. You didn't in any way negate the condition of superpage support to be added post-4.4 in order for your other change to go in: Neither http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.html nor http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.html have been responded to by you. By not doing so, to me at least you implicitly accepted the condition. And by now refusing to meet it, you basically tell us that we shouldn't be doing compromises like this with you in the future. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Jan Beulich <JBeulich@suse.com> Cc: "Nakajima, Jun" <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 21 May 2014 08:37:35 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACF97A@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Jan Beulich wrote on 2014-05-21: >>>> On 21.05.14 at 03:02, <yang.z.zhang@intel.com> wrote: >> Jan Beulich wrote on 2014-05-20: >>>>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote: >>>> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> > wrote: >>>>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote: >>>>>> George Dunlap wrote on 2014-05-19: >>>>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a >>>>>>> video buffer isn't really robust enough. I think that was Tim >>>>>>> and Jan's >>>>>> >>>>>> Video buffer is only one case. How we can prevent the DMA to other >>>>>> reserved region? >>>>> >>>>> You continue to neglect the difference: Accessing VRAM this way is >>>>> legitimate (and potentially useful). And - as just said in the >>>>> other reply - ideally we'd also simply ignore accesses to reserved >> >> Can you give an example of what legitmate case you are mentioned twice(You >> mentioned it also in other reply)? > > What's wrong with loading some graphics data right from an I/O device > (disk, network) into the frame buffer? Yes, it is ok. But we are talking about the DMA access to a 'readonly' buffer. It is totally a wrong usage model to me. > >> I cannot understand why we need to >> restrict the CPU access to VRAM region but allow accessing from device. > > We'd love to also do this for device accesses, but to date IOMMU > fault are unrecoverable, and hence this is not an option. > >> As I known, for gfx passthrough, both device and CPU are able to access >> them. And for emulated gfx, only software will access it which same as >> current we see in Xen. > > Why's that? I.e. why should a guest with emulated graphics not be > able to program a passed through device to access the VRAM directly? I think you confused the background of the issue: the essence of the issue is that we marked all pages as read-only and meanwhile we allow DMA to access those page. So the right solution is only mark the required page (in this case means VRAM region) as readonly and if guest still DMA to that region, then any unpredictable behavior to that guest is reasonable. What you said before totally ignore the 'read-only' thing. If there is no read-only operation, there will be no problem and no discussion around it. > >>>>> regions (and in fact we try to, by not immediately bringing down a >>>>> guest device doing such). >>>> >>>> On the other hand, just to play devil's advocate here: Implementing >>>> separate IOMMU tables (including superpages) isn't free; it has a >>>> non-negligible cost, both in initial developer time, continuing >>>> maintenance (code complexity, fixing bugs), extra memory at >>>> run-time, &c. >>>> >>>> Of all the things we could invest that developer time doing, why >>>> should we make it possible to DMA into VRAM, rather than doing >>>> something else? >>> >>> While I agree that the question is valid, my position really is that >>> it was a mistake to implement the IOMMU code without superpage >> >> We support the superpage via sharing EPT and VT-d pagetable. > > I.e. without EPT no superpages. Afaict there's no such limitation on > the AMD side. I don't think this is a 'limitation'. The original motivation to sharing page table is to use memory efficiently and easy for code maintain. And when we tried to add superpage support, we follow this way (we only need to change one copy of page table). But you think it is a limitation. I cannot understand. > >>> support, i.e. I view this as a shortcoming independent of the VRAM >>> issue, and I would want to see this fixed rather sooner than later. >>> Had it been done properly from the beginning (like one would expect >>> for non-experimental code), a lot of this discussion could have been >>> avoided, and we wouldn't have had to take the respective workaround >>> close to the 4.4 release. >> >> I still think the best solution is fixing the VRAM global log dirty >> mechanism which my previous patch already did. Because I cannot see any >> benefit with separating the page table. > > You didn't in any way negate the condition of superpage support to be > added post-4.4 in order for your other change to go in: Neither > http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.html > nor > http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.html > have been responded to by you. By not doing so, to me at least you > implicitly accepted the condition. And by now refusing to meet it, you > basically tell us that we shouldn't be doing compromises like this with > you in the future. I have said before I am totally unware of it. And I know it only two days ago after someone told me. So please do not confuse this with the thing what we are discussing now. If you think I gave a promise implicitly at that time, I am sorry to let you think so. As I said in previous thread, if we can prove that add hugepage for the separate VT-d page table is the only choice to solve problem, then now I am promising that I will do it ASAP. But till now, I didn't see any point that we must to have it. To me, it is still a nice to have feature. > > Jan Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Wed, 21 May 2014 10:58:01 +0100 Message-ID: <537C94C902000078000146F5@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote: > Jan Beulich wrote on 2014-05-21: >>>>> On 21.05.14 at 03:02, <yang.z.zhang@intel.com> wrote: >>> Jan Beulich wrote on 2014-05-20: >>>>>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote: >>>>> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> >> wrote: >>>>>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote: >>>>>>> George Dunlap wrote on 2014-05-19: >>>>>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a >>>>>>>> video buffer isn't really robust enough. I think that was Tim >>>>>>>> and Jan's >>>>>>> >>>>>>> Video buffer is only one case. How we can prevent the DMA to other >>>>>>> reserved region? >>>>>> >>>>>> You continue to neglect the difference: Accessing VRAM this way is >>>>>> legitimate (and potentially useful). And - as just said in the >>>>>> other reply - ideally we'd also simply ignore accesses to reserved >>> >>> Can you give an example of what legitmate case you are mentioned twice(You >>> mentioned it also in other reply)? >> >> What's wrong with loading some graphics data right from an I/O device >> (disk, network) into the frame buffer? > > Yes, it is ok. But we are talking about the DMA access to a 'readonly' > buffer. It is totally a wrong usage model to me. What read-only buffer are you referring to? From guest perspective, nothing that gets marked read-only due to log-dirty handling really is read-only. The fact that migration doesn't work due to this with an assigned device can be excused by the handling of such an assigned device not being implemented anyway. But the fact that behind the scenes VRAM gets marked read-only should be (but isn't) entirely transparent to the guest. I'm not going to reply to the rest of your mail, both because I'm getting the impression that we're moving in circles, and because I might become impolite otherwise. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Fri, 23 May 2014 07:42:20 +0100 Message-ID: <537F09EC020000780001530D@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote: > Jan Beulich wrote on 2014-05-21: >> You didn't in any way negate the condition of superpage support to be >> added post-4.4 in order for your other change to go in: Neither >> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.html >> nor >> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.html >> have been responded to by you. By not doing so, to me at least you >> implicitly accepted the condition. And by now refusing to meet it, you >> basically tell us that we shouldn't be doing compromises like this with >> you in the future. > > I have said before I am totally unware of it. And I know it only two days > ago after someone told me. So please do not confuse this with the thing what > we are discussing now. If you think I gave a promise implicitly at that time, > I am sorry to let you think so. > > As I said in previous thread, if we can prove that add hugepage for the > separate VT-d page table is the only choice to solve problem, then now I am > promising that I will do it ASAP. But till now, I didn't see any point that > we must to have it. To me, it is still a nice to have feature. Btw., I think I just spotted a second thing not working without split page tables: mem-access (which doesn't and imo shouldn't depend on !need_iommu(), other than mem-sharing and mem-paging) likewise has the potential of creating entries resulting in IOMMU faults. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Zhang, Yang Z" <yang.z.zhang@intel.com> To: Jan Beulich <JBeulich@suse.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 26 May 2014 08:16:43 +0000 Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AAD38E7@SHSMSX104.ccr.corp.intel.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
Jan Beulich wrote on 2014-05-23: >>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote: >> Jan Beulich wrote on 2014-05-21: >>> You didn't in any way negate the condition of superpage support to >>> be added post-4.4 in order for your other change to go in: Neither >>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230. >>> html >>> nor >>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236. >>> html have been responded to by you. By not doing so, to me at >>> least you implicitly accepted the condition. And by now refusing to >>> meet it, you basically tell us that we shouldn't be doing >>> compromises like this with you in the future. >> >> I have said before I am totally unware of it. And I know it only two >> days ago after someone told me. So please do not confuse this with >> the thing what we are discussing now. If you think I gave a promise >> implicitly at that time, I am sorry to let you think so. >> >> As I said in previous thread, if we can prove that add hugepage for >> the separate VT-d page table is the only choice to solve problem, >> then now I am promising that I will do it ASAP. But till now, I >> didn't see any point that we must to have it. To me, it is still a nice to have feature. > > Btw., I think I just spotted a second thing not working without split page tables: > mem-access (which doesn't and imo shouldn't depend on !need_iommu(), > other than mem-sharing and mem-paging) likewise has the potential of > creating entries resulting in IOMMU faults. > I don't know what mem-access is? Do you mean Xenaccess? If not, can you elaborate it or provide a link to help me to understand how it works? > Jan Best regards, Yang _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Yang Z Zhang" <yang.z.zhang@intel.com> Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 26 May 2014 10:04:57 +0100 Message-ID: <53831FD902000078000159F9@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote: > Jan Beulich wrote on 2014-05-23: >>>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote: >>> Jan Beulich wrote on 2014-05-21: >>>> You didn't in any way negate the condition of superpage support to >>>> be added post-4.4 in order for your other change to go in: Neither >>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230. >>>> html >>>> nor >>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236. >>>> html have been responded to by you. By not doing so, to me at >>>> least you implicitly accepted the condition. And by now refusing to >>>> meet it, you basically tell us that we shouldn't be doing >>>> compromises like this with you in the future. >>> >>> I have said before I am totally unware of it. And I know it only two >>> days ago after someone told me. So please do not confuse this with >>> the thing what we are discussing now. If you think I gave a promise >>> implicitly at that time, I am sorry to let you think so. >>> >>> As I said in previous thread, if we can prove that add hugepage for >>> the separate VT-d page table is the only choice to solve problem, >>> then now I am promising that I will do it ASAP. But till now, I >>> didn't see any point that we must to have it. To me, it is still a nice to > have feature. >> >> Btw., I think I just spotted a second thing not working without split page > tables: >> mem-access (which doesn't and imo shouldn't depend on !need_iommu(), >> other than mem-sharing and mem-paging) likewise has the potential of >> creating entries resulting in IOMMU faults. >> > > I don't know what mem-access is? Do you mean Xenaccess? If not, can you > elaborate it or provide a link to help me to understand how it works? The (example) tool indeed is named xen-access. See XENMEM_access_op (used to be HVMOP_{get,set}_mem_access). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Nakajima, Jun" <jun.nakajima@intel.com> To: Jan Beulich <JBeulich@suse.com> Cc: Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Fri, 30 May 2014 18:26:18 -0700 Message-ID: <CAL54oT3edU8dE+gGSbaPerkuhdgTchW4n=j3+syzuc9xEGxmcQ@mail.gmail.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote: >> Jan Beulich wrote on 2014-05-23: >>>>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote: >>>> Jan Beulich wrote on 2014-05-21: >>>>> You didn't in any way negate the condition of superpage support to >>>>> be added post-4.4 in order for your other change to go in: Neither >>>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230. >>>>> html >>>>> nor >>>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236. >>>>> html have been responded to by you. By not doing so, to me at >>>>> least you implicitly accepted the condition. And by now refusing to >>>>> meet it, you basically tell us that we shouldn't be doing >>>>> compromises like this with you in the future. >>>> >>>> I have said before I am totally unware of it. And I know it only two >>>> days ago after someone told me. So please do not confuse this with >>>> the thing what we are discussing now. If you think I gave a promise >>>> implicitly at that time, I am sorry to let you think so. >>>> >>>> As I said in previous thread, if we can prove that add hugepage for >>>> the separate VT-d page table is the only choice to solve problem, >>>> then now I am promising that I will do it ASAP. But till now, I >>>> didn't see any point that we must to have it. To me, it is still a nice to >> have feature. >>> >>> Btw., I think I just spotted a second thing not working without split page >> tables: >>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(), >>> other than mem-sharing and mem-paging) likewise has the potential of >>> creating entries resulting in IOMMU faults. >>> >> >> I don't know what mem-access is? Do you mean Xenaccess? If not, can you >> elaborate it or provide a link to help me to understand how it works? > > The (example) tool indeed is named xen-access. See XENMEM_access_op > (used to be HVMOP_{get,set}_mem_access). > The tool xen-access is located in tools/tests, and I think that this is used mostly by developers who know what they are doing. If we had separate VT-d page tables, they might observe confusing results; even if they write-protect pages, somebody (i.e. I/O devices) modifies those pages. To me, observing IOMMU faults seems consistent with the consequence of changes to guest memory permission. -- Jun Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "Jun Nakajima" <jun.nakajima@intel.com> Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 02 Jun 2014 07:55:00 +0100 Message-ID: <538C3BE40200007800016AE1@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote: > On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote: >>> Jan Beulich wrote on 2014-05-23: >>>> Btw., I think I just spotted a second thing not working without split page >>> tables: >>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(), >>>> other than mem-sharing and mem-paging) likewise has the potential of >>>> creating entries resulting in IOMMU faults. >>>> >>> >>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you >>> elaborate it or provide a link to help me to understand how it works? >> >> The (example) tool indeed is named xen-access. See XENMEM_access_op >> (used to be HVMOP_{get,set}_mem_access). >> > > The tool xen-access is located in tools/tests, and I think that this > is used mostly by developers who know what they are doing. The tool is, indeed. But the underlying feature clearly isn't limited to or solely intended for developers. > If we had separate VT-d page tables, they might observe confusing > results; even if they write-protect pages, somebody (i.e. I/O devices) > modifies those pages. > To me, observing IOMMU faults seems consistent with the consequence of > changes to guest memory permission. And I would agree if these faults were restartable. You're certainly aware that a not too large amount of faults within a reasonably short period of time will lead to the device being turned off, with quite likely fatal consequences to the guest. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <George.Dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com> Cc: Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 2 Jun 2014 15:06:17 +0100 Message-ID: <CAFLBxZYUp_Kws0-M9wgoyKvM6G_SwD1+DGwTECAK0M5-_h1OTQ@mail.gmail.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On Mon, Jun 2, 2014 at 7:55 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote: >> On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote: >>>> Jan Beulich wrote on 2014-05-23: >>>>> Btw., I think I just spotted a second thing not working without split page >>>> tables: >>>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(), >>>>> other than mem-sharing and mem-paging) likewise has the potential of >>>>> creating entries resulting in IOMMU faults. >>>>> >>>> >>>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you >>>> elaborate it or provide a link to help me to understand how it works? >>> >>> The (example) tool indeed is named xen-access. See XENMEM_access_op >>> (used to be HVMOP_{get,set}_mem_access). >>> >> >> The tool xen-access is located in tools/tests, and I think that this >> is used mostly by developers who know what they are doing. > > The tool is, indeed. But the underlying feature clearly isn't limited > to or solely intended for developers. > >> If we had separate VT-d page tables, they might observe confusing >> results; even if they write-protect pages, somebody (i.e. I/O devices) >> modifies those pages. >> To me, observing IOMMU faults seems consistent with the consequence of >> changes to guest memory permission. > > And I would agree if these faults were restartable. You're certainly > aware that a not too large amount of faults within a reasonably short > period of time will lead to the device being turned off, with quite likely > fatal consequences to the guest. Sure -- but there are a number of features (PoD, paging, page sharing, even migration) which are incompatible with pass-through, and the user is simply not allowed to use them together. Why not just add this one to the list? -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: "Jan Beulich" <JBeulich@suse.com> To: "George Dunlap" <George.Dunlap@eu.citrix.com> Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 02 Jun 2014 15:27:20 +0100 Message-ID: <538CA5E80200007800016E34@mail.emea.novell.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
>>> On 02.06.14 at 16:06, <George.Dunlap@eu.citrix.com> wrote: > On Mon, Jun 2, 2014 at 7:55 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote: >>> On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote: >>>>> Jan Beulich wrote on 2014-05-23: >>>>>> Btw., I think I just spotted a second thing not working without split page >>>>> tables: >>>>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(), >>>>>> other than mem-sharing and mem-paging) likewise has the potential of >>>>>> creating entries resulting in IOMMU faults. >>>>>> >>>>> >>>>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you >>>>> elaborate it or provide a link to help me to understand how it works? >>>> >>>> The (example) tool indeed is named xen-access. See XENMEM_access_op >>>> (used to be HVMOP_{get,set}_mem_access). >>>> >>> >>> The tool xen-access is located in tools/tests, and I think that this >>> is used mostly by developers who know what they are doing. >> >> The tool is, indeed. But the underlying feature clearly isn't limited >> to or solely intended for developers. >> >>> If we had separate VT-d page tables, they might observe confusing >>> results; even if they write-protect pages, somebody (i.e. I/O devices) >>> modifies those pages. >>> To me, observing IOMMU faults seems consistent with the consequence of >>> changes to guest memory permission. >> >> And I would agree if these faults were restartable. You're certainly >> aware that a not too large amount of faults within a reasonably short >> period of time will lead to the device being turned off, with quite likely >> fatal consequences to the guest. > > Sure -- but there are a number of features (PoD, paging, page sharing, > even migration) which are incompatible with pass-through, and the user > is simply not allowed to use them together. Why not just add this one > to the list? Considering that mem-access came at about the same time or even later than mem-paging and mem-sharing, I was silently assuming that it wasn't enforcing the same restrictions as the other two for a reason. Maybe it was just forgotten... Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
From: George Dunlap <george.dunlap@eu.citrix.com> To: Jan Beulich <JBeulich@suse.com> Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram Date: Mon, 2 Jun 2014 16:03:20 +0100 Message-ID: <538C9238.5080701@eu.citrix.com>
[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]
On 06/02/2014 03:27 PM, Jan Beulich wrote: >>>> On 02.06.14 at 16:06, <George.Dunlap@eu.citrix.com> wrote: >> On Mon, Jun 2, 2014 at 7:55 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote: >>>> On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote: >>>>>> Jan Beulich wrote on 2014-05-23: >>>>>>> Btw., I think I just spotted a second thing not working without split page >>>>>> tables: >>>>>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(), >>>>>>> other than mem-sharing and mem-paging) likewise has the potential of >>>>>>> creating entries resulting in IOMMU faults. >>>>>>> >>>>>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you >>>>>> elaborate it or provide a link to help me to understand how it works? >>>>> The (example) tool indeed is named xen-access. See XENMEM_access_op >>>>> (used to be HVMOP_{get,set}_mem_access). >>>>> >>>> The tool xen-access is located in tools/tests, and I think that this >>>> is used mostly by developers who know what they are doing. >>> The tool is, indeed. But the underlying feature clearly isn't limited >>> to or solely intended for developers. >>> >>>> If we had separate VT-d page tables, they might observe confusing >>>> results; even if they write-protect pages, somebody (i.e. I/O devices) >>>> modifies those pages. >>>> To me, observing IOMMU faults seems consistent with the consequence of >>>> changes to guest memory permission. >>> And I would agree if these faults were restartable. You're certainly >>> aware that a not too large amount of faults within a reasonably short >>> period of time will lead to the device being turned off, with quite likely >>> fatal consequences to the guest. >> Sure -- but there are a number of features (PoD, paging, page sharing, >> even migration) which are incompatible with pass-through, and the user >> is simply not allowed to use them together. Why not just add this one >> to the list? > Considering that mem-access came at about the same time or even > later than mem-paging and mem-sharing, I was silently assuming that > it wasn't enforcing the same restrictions as the other two for a reason. > Maybe it was just forgotten... Well one can certainly imagine someone wanting to run it in that mode, and "knowing" that it would be safe because they were going to be careful not going to use memaccess on pages on which a DMA was going to happen. I think in that situation, a developer would probably want to know that their assumption was wrong immediately (by having the IOMMU fault perhaps crash the domain), than have to figure out that their assumption was wrong by having unexplainable results. If memaccess + passthru is enabled by default at the moment, there may be an argument for disabling it by default, perhaps with an override for those who want the ability to shoot themselves in the foot. But I don't think that silent violations of the memaccess "promise" is better than just having the IOMMU faults; so I don't think that's an argument for implementing non-shared IOMMU superpages. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel